Select Language

Towards Formal Verification of Password Generation Algorithms in Password Managers

A research paper proposing a formally verified reference implementation for random password generators in password managers, using EasyCrypt for functional correctness and security proofs.
computationalcoin.com | PDF Size: 0.1 MB
Rating: 4.5/5
Your Rating
You have already rated this document
PDF Document Cover - Towards Formal Verification of Password Generation Algorithms in Password Managers

1. Introduction

Password managers (PMs) are critical tools recommended by security experts to mitigate vulnerabilities associated with password authentication, such as weak passwords and password reuse. They enable the use of strong, unique passwords by handling storage and generation. However, user adoption is hampered by a lack of trust in these applications. This paper identifies the random password generation (RPG) feature as a key factor influencing both trust and usage. To address this, the authors propose the development and formal verification of a reference implementation for an RPG algorithm, aiming to provide a cryptographically sound and provably secure foundation that users and developers can trust.

The core problem is that while PMs promote security, their internal mechanisms—especially password generation—are often opaque black boxes. Without verifiable guarantees, users remain skeptical. The solution pathway involves using formal methods, specifically the EasyCrypt framework, to mathematically specify the algorithm and prove its correctness and security properties against a well-defined adversarial model.

2. Current Password Generation Algorithms

The paper surveys 15 password managers, focusing on three open-source, widely-used examples: Google Chrome's built-in manager, Bitwarden, and KeePass. The analysis reveals common patterns and critical differences in their approaches to generating random passwords.

2.1 Password Composition Policies

All studied PMs allow users to define policies constraining the structure of generated passwords. These policies typically include:

  • Password Length: Ranges from 1-200 (Chrome) to an extreme 1-30000 (KeePass).
  • Character Sets: Standard sets include lowercase letters, uppercase letters, numbers, and special characters. KeePass offers additional granularity with separate sets for brackets, space, minus, and underline.
  • Minimum/Maximum per Set: Chrome and Bitwarden allow defining minimum occurrences per character set. KeePass does not.
  • Exclusion of Ambiguous Characters: All three allow excluding visually similar characters (e.g., 'l', '1', 'O', '0') to reduce user error.
  • Custom Sets & Duplicates: KeePass uniquely allows defining custom character sets to include or exclude, and can remove duplicate characters from the generated password.

The variation in policy options highlights a lack of standardization, which complicates the creation of a universal, verifiable model.

2.2 Random Password Generation

The general algorithm involves randomly selecting characters from the allowed sets while respecting the policy constraints (length, minimums, maximums). The paper describes Chrome's algorithm in detail:

  1. First, generate characters for each set that has a defined minimum number of occurrences.
  2. Then, generate remaining characters by randomly selecting from the union of all sets that have not yet reached their maximum allowed occurrences.
  3. Finally, apply a random permutation to the string of generated characters.

This process, while seemingly straightforward, must be implemented carefully to ensure true randomness and unbiased distribution across the entire policy-constrained password space. Subtle biases in selection or permutation can weaken the resulting passwords.

3. Proposed Formally Verified Implementation

The paper's central contribution is the proposal to build a reference RPG implementation with machine-checked proofs of its properties.

3.1 Formal Specification

The first step is to create a precise mathematical specification of the password generation algorithm within EasyCrypt. This specification defines:

  • Inputs: Policy parameters (length $L$, character sets $S_1, S_2, ..., S_n$, minimums $min_i$, maximums $max_i$).
  • Output: A password string $p$ of length $L$.
  • Preconditions: The policy must be consistent (e.g., $\sum min_i \leq L$).
  • Postconditions (Functional Correctness): The output $p$ must satisfy all policy constraints. Formally, $\forall i, min_i \leq count(p, S_i) \leq max_i$, where $count$ tallies characters from set $S_i$ in $p$.

3.2 Security Properties and Proofs

Beyond correctness, the implementation must be proven secure. The paper adopts a game-based cryptographic proof approach. The key security property is uniform random sampling from the set of all passwords satisfying the given policy.

This is formalized as a security game where an adversary tries to distinguish between a password generated by the real algorithm and a password sampled uniformly from the valid password space. The proof demonstrates that no efficient adversary can win this game with probability significantly better than random guessing (1/2). This ensures the algorithm does not introduce predictable patterns or biases.

The proof would leverage EasyCrypt's libraries for reasoning about probabilistic computations and random sampling.

4. Experimental Results & Analysis

While the PDF provided is a preliminary work and does not contain full experimental results, it sets the stage for them. The proposed verification would yield the following tangible outcomes:

  • Verification Report: A machine-generated proof certificate from EasyCrypt, confirming the algorithm's code adheres to its formal specification and security theorems.
  • Comparative Analysis: The verified algorithm could be compared against the existing implementations in Chrome, Bitwarden, and KeePass. Tests would involve generating large batches of passwords (e.g., 1 million) under identical policies and statistically analyzing the distribution.
  • Metric: A key metric would be the Kullback-Leibler (KL) Divergence or Chi-squared test between the empirical distribution of generated passwords and the theoretical uniform distribution over the policy-defined space. A formally verified algorithm should show a divergence statistically indistinguishable from zero, while unverified implementations might reveal subtle biases.
  • Chart Description: A bar chart comparing the entropy (in bits) of the generated password distribution for each PM's algorithm against the theoretical maximum entropy for the given policy. The verified reference implementation's bar should align perfectly with the "Theoretical Max" bar.

5. Technical Details & Mathematical Framework

The formal verification relies on precise mathematical modeling. Let's define the core concepts:

Password Space: Given a policy with length $L$ and allowed character sets $S_1, ..., S_n$, the total set of compliant passwords $\mathcal{P}$ is a subset of the Cartesian product $\mathcal{C}^L$, where $\mathcal{C} = \bigcup_{i=1}^n S_i$. The size of $\mathcal{P}$ is constrained by the minimum and maximum rules.

Uniform Distribution: The security goal is for the algorithm $\mathcal{A}$ to implement a function indistinguishable from a true uniform sampler $\mathcal{U}_{\mathcal{P}}$. For any password $p \in \mathcal{P}$, the probability should be: $$\Pr[\mathcal{A}(policy) = p] = \frac{1}{|\mathcal{P}|}$$ where $|\mathcal{P}|$ is the cardinality of the valid password set.

Game-Based Proof Sketch: The security is framed as a "Real-or-Random" (RoR) game.

  1. The challenger flips a secret coin $b \xleftarrow{\$} \{0,1\}$.
  2. The adversary $\mathcal{D}$ can query the challenger with password policies.
  3. If $b=0$ (Real), the challenger runs the actual algorithm $\mathcal{A}$.
  4. If $b=1$ (Random), the challenger samples uniformly from $\mathcal{P}$ using $\mathcal{U}_{\mathcal{P}}$.
  5. $\mathcal{D}$ outputs a guess $b'$.
The advantage of the adversary is defined as: $$\mathbf{Adv}^{ror}_{\mathcal{A},\mathcal{D}}(\lambda) = \left| \Pr[b' = b] - \frac{1}{2} \right|$$ The proof demonstrates that for all probabilistic polynomial-time adversaries $\mathcal{D}$, this advantage is negligible in the security parameter $\lambda$ (related to the randomness source).

6. Analysis Framework: A Non-Code Example

Since the PDF does not include actual code, here is a conceptual analysis framework for evaluating an RPG algorithm, presented as a checklist:

Case: Assessing "PM-X"s password generator.

Step 1 - Policy Deconstruction: Map PM-X's user interface options (checkboxes, sliders) to a formal policy tuple: (L=12, Sets={Lower, Upper, Digits}, min_Lower=1, min_Upper=1, min_Digits=1, exclude={'l','O','0'}).

Step 2 - Algorithmic Transparency: Can the algorithm's steps (like Chrome's 3-step process) be clearly described from documentation or source code? If not, it fails the transparency test.

Step 3 - Entropy Calculation: Calculate the theoretical maximum entropy for the policy: $log_2(|\mathcal{P}|)$ bits. For the above policy, $|\mathcal{P}|$ is the number of 12-character strings from an alphabet (e.g., 70 characters after exclusions) meeting the minimum constraints. This is the security benchmark.

Step 4 - Statistical Test Design: Design an experiment to generate $N$ passwords and test for uniform distribution. Use the Chi-squared test across the entire password space or a cleverly sampled subset.

Step 5 - Bias Detection: Look for specific biases: Are certain character positions less random? Does the algorithm for fulfilling minimums reduce randomness for the remaining slots?

This framework provides a structured, code-free methodology to critically examine any RPG, aligning with the paper's goal of moving from obscurity to verifiable analysis.

7. Future Applications & Research Directions

The work opens several promising avenues:

  • Standardization: A formally verified RPG could become a standard component (like a library) integrated into PMs across the industry, similar to how libsodium provides verified cryptographic primitives.
  • Browser and OS Integration: Operating systems (e.g., Windows Hello, macOS Keychain) and browsers could adopt the verified generator for their built-in password suggestion features, raising the baseline security for all users.
  • Hardware-Backed Generation: The verified algorithm could be implemented in secure hardware elements (TPM, Secure Enclave) for generation that is both physically and logically secure.
  • Post-Quantum Considerations: Future research could explore password generation policies that produce passwords resistant to both classical and quantum brute-force attacks, with formal proofs adapting to new threat models.
  • Usability-Security Trade-off Verification: Extend the formal model to prove properties about the memorability or typability of generated passwords, bridging the gap between pure cryptography and human-computer interaction.
  • Automated Policy Analysis: Develop tools that use the formal model to automatically analyze a user-defined policy and report its effective entropy and any unintended constraints that weaken the password space.

8. References

  1. Grilo, M., Ferreira, J. F., & Almeida, J. B. (2021). Towards Formal Verification of Password Generation Algorithms used in Password Managers. arXiv preprint arXiv:2106.03626.
  2. Bellare, M., & Rogaway, P. (2006). The security of triple encryption and a framework for code-based game-playing proofs. In Advances in Cryptology-EUROCRYPT 2006 (pp. 409-426). Springer.
  3. Barthe, G., Dupressoir, F., Grégoire, B., Kunz, C., Schmidt, B., & Strub, P. Y. (2016). EasyCrypt: A computer-aided cryptography toolset. Lecture Notes in Computer Science, 9573, 3-32.
  4. National Institute of Standards and Technology (NIST). (2020). Digital Identity Guidelines: Authentication and Lifecycle Management (SP 800-63B).
  5. Shannon, C. E. (1948). A mathematical theory of communication. The Bell System Technical Journal, 27(3), 379-423. (Foundational work on entropy and information theory).
  6. Florêncio, D., Herley, C., & Van Oorschot, P. C. (2014). An administrator's guide to internet password research. In 28th Large Installation System Administration Conference (LISA14) (pp. 44-61).

9. Analyst's Perspective: Core Insight & Critique

Core Insight

This paper correctly identifies a critical, yet often overlooked, vulnerability in the cybersecurity supply chain: the unverified randomness at the heart of password managers. The real insight isn't that password generation is complex—it's that the trust model for a foundational security tool is broken. Users are asked to trust a black box with their digital keys. The authors' proposal to apply formal verification, a technique more common in hardware design and critical aviation software (like DO-178C certified systems), to a consumer security primitive is both ambitious and necessary. It's an attempt to transplant the rigor of the SRI International or INRIA research lab into the often-sloppy world of application security.

Logical Flow

The argument flows logically from a well-established problem (user distrust of PMs) to a specific, technical root cause (opaque password generation), and then to a concrete, high-assurance solution pathway (formal specification and proof). The survey of existing PMs is not just academic; it effectively demonstrates the wild inconsistency in implementations, making the case for a standard, verified reference. The pivot to EasyCrypt and game-based proofs is appropriate, as this framework, developed by leading institutions in formal crypto, is precisely designed for this class of problem. The flow mirrors the methodology of seminal works in verified cryptography, such as the verification of the HMAC-DRBG generator.

Strengths & Flaws

Strengths: The paper's greatest strength is its focus on a high-impact, tractable problem. It doesn't try to verify an entire PM; it goes for the cryptographic core. Using open-source PMs for analysis grounds the research in reality. The choice of game-based proofs is the industry standard for modern crypto analysis.

Critical Flaws & Gaps: The paper, as presented, is more of a compelling proposal than a completed study. The most glaring omission is the actual EasyCrypt code and completed proofs. Without these, the claim remains theoretical. Furthermore, it underplays the policy complexity problem. The formal specification must handle every edge case of user-defined policies (e.g., conflicting min/max, custom character sets). This could lead to a specification as complex as the implementation, a known challenge in formal methods. It also sidesteps the real-world entropy source—the algorithm is only as good as the system's CSPRNG (e.g., /dev/urandom). A verified algorithm fed weak randomness is still weak. The paper would be stronger by explicitly bounding its guarantees based on a perfect entropy source assumption.

Actionable Insights

1. For PM Developers: Immediately open-source your password generation code and subject it to third-party audit. Begin modeling your algorithm, even informally, to identify potential biases. Consider integrating a verified library like the one this research aims to produce.

2. For Security Auditors: Add "RPG Algorithm Analysis" to your standard PM audit checklist. Use the statistical framework outlined in Section 6 to test for distributional biases in client outputs.

3. For Standards Bodies (e.g., NIST, FIDO Alliance): Initiate a working group to define a standard API and set of security requirements for password generators, paving the way for certified implementations. This mirrors the successful standardization of cryptographic algorithms.

4. For Researchers: This work is a perfect starting point. The next step is to complete the EasyCrypt implementation and proofs. A subsequent, crucial phase is to develop a testing harness that can take the verified code and fuzz it against the real-world code of Chrome, Bitwarden, etc., to find concrete discrepancies and vulnerabilities, moving from theory to practical impact.

In conclusion, this paper shines a necessary light on a dark corner of our security infrastructure. While incomplete, its direction is correct. The password manager industry has matured beyond the "just trust us" phase; it's time for verifiable, mathematical trust. The success of this research could set a new precedent for how we build all client-side security software.