
Security Evaluation of Browser-Based Password Managers: Generation, Storage, and Autofill

A comprehensive security analysis of 13 popular password managers, evaluating password generation randomness, storage security, and autofill vulnerabilities.

1. Introduction

Password-based authentication remains the dominant method for web authentication despite its well-documented security challenges. Users face cognitive burdens when managing multiple strong passwords, leading to password reuse and weak password creation. Password managers offer a potential solution by generating, storing, and autofilling passwords. However, previous research has identified significant vulnerabilities in browser-based password managers. This study provides an updated security evaluation of thirteen popular password managers five years after prior assessments, examining all three stages of the password manager lifecycle: generation, storage, and autofill.

2. Methodology & Scope

The evaluation covers thirteen password managers, including five browser extensions, six browser-integrated managers, and two desktop clients for comparison. The analysis replicates and expands upon previous work by Li et al. (2014), Silver et al. (2014), and Stock & Johns (2014). The methodology involves:

  • Generating and analyzing 147 million passwords for randomness and strength
  • Examining storage mechanisms for encryption and metadata protection
  • Testing autofill features against clickjacking and XSS attacks
  • Assessing default security configurations

3. Password Generation Analysis

This section presents the first comprehensive analysis of password generation algorithms in password managers.

3.1. Randomness Evaluation

The study evaluated the randomness of generated passwords using statistical tests, including chi-square tests for character distribution and entropy calculations. The entropy $H$ of a password of length $L$ whose characters are drawn independently and uniformly from a character set of size $N$ is $H = L \cdot \log_2(N)$. For a truly random 12-character password using 94 possible characters (letters, numbers, symbols), the entropy is $H = 12 \cdot \log_2(94) \approx 78.7$ bits.
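The entropy formula can be checked numerically. The following is a minimal Python sketch; the function name `password_entropy_bits` is illustrative and not taken from the study, and the result is only meaningful if every character is drawn independently and uniformly at random.

```python
import math


def password_entropy_bits(length: int, charset_size: int) -> float:
    """Entropy H = L * log2(N) of a uniformly random password, in bits."""
    if length < 0 or charset_size < 1:
        raise ValueError("length must be >= 0 and charset_size must be >= 1")
    return length * math.log2(charset_size)


if __name__ == "__main__":
    # Compare a few lengths over the 94-character set mentioned above.
    for length in (8, 12, 18):
        bits = password_entropy_bits(length, 94)
        print(f"length={length:2d}  entropy={bits:.1f} bits")
```

Because the formula assumes a uniform generator, it gives an upper bound for any generator with the biases described in the next section.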

3.2. Character Distribution Analysis

Analysis revealed non-random character distributions in several password managers. Some generators showed bias toward certain character classes or positions within the password string. For example, one manager consistently placed special characters in predictable positions, reducing effective entropy.
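One way to surface this kind of positional bias is to tabulate, for each character position, how often each character class appears. The sketch below is an illustration under assumed names (`char_class`, `positional_class_frequencies`), not the study's tooling, and it expects all sampled passwords to have the same length.

```python
import string
from collections import Counter

CLASSES = ("lower", "upper", "digit", "symbol")


def char_class(ch: str) -> str:
    """Map a character to one of four coarse character classes."""
    if ch in string.ascii_lowercase:
        return "lower"
    if ch in string.ascii_uppercase:
        return "upper"
    if ch in string.digits:
        return "digit"
    return "symbol"


def positional_class_frequencies(passwords: list[str]) -> list[dict[str, float]]:
    """For each position, return the fraction of passwords per character class."""
    if not passwords:
        raise ValueError("need at least one password")
    length = len(passwords[0])
    if any(len(p) != length for p in passwords):
        raise ValueError("all passwords must have the same length")
    table = []
    for pos in range(length):
        counts = Counter(char_class(p[pos]) for p in passwords)
        # Counter returns 0 for classes that never occur at this position.
        table.append({cls: counts[cls] / len(passwords) for cls in CLASSES})
    return table
```

For an unbiased generator, each row should be close to the class proportions of the character set itself. A position where one class appears almost always, or almost never, is a candidate for closer inspection.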

3.3. Guessing Attack Vulnerability

The research found that shorter generated passwords (under 10 characters) were vulnerable to online guessing attacks, while passwords under 18 characters were susceptible to offline attacks. This contradicts the common assumption that password-manager-generated passwords are uniformly strong.
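A rough lower bound on safe password length follows from simple arithmetic: a uniformly random password survives exhaustive guessing only if the keyspace $N^L$ exceeds the attacker's guess budget. The sketch below uses guess budgets of $10^6$ (online) and $10^{14}$ (offline) as illustrative assumptions; they are not figures from this summary. It models an ideal generator only, so it does not reproduce the 10- and 18-character thresholds reported above, which concern the passwords the managers actually produced.

```python
def min_length_for_budget(charset_size: int, guess_budget: int) -> int:
    """Smallest length L such that charset_size**L exceeds guess_budget."""
    if charset_size < 2 or guess_budget < 1:
        raise ValueError("charset_size must be >= 2 and guess_budget >= 1")
    length = 1
    while charset_size ** length <= guess_budget:
        length += 1
    return length


if __name__ == "__main__":
    # Assumed guess budgets, for illustration only.
    budgets = {"online": 10**6, "offline": 10**14}
    for name, budget in budgets.items():
        for charset_size in (10, 26, 94):
            needed = min_length_for_budget(charset_size, budget)
            print(f"{name:7s} N={charset_size:2d} -> minimum length {needed}")
```

Any bias in the generator shrinks the effective keyspace, so real generators need more length than this idealized bound suggests.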

4. Password Storage Security

The evaluation of password storage mechanisms revealed both improvements and persistent vulnerabilities compared to five years prior.

4.1. Encryption & Metadata Protection

While most managers now encrypt password databases, several were found to store metadata (URLs, usernames, timestamps) in unencrypted form. This metadata leakage can provide attackers with valuable reconnaissance information even without decrypting the actual passwords.
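A simple black-box check for this kind of leakage is to save a test credential with distinctive values and then search the manager's on-disk store for those values. The following sketch assumes the store can be read as raw bytes; the function name and the sample data are hypothetical. It only detects verbatim strings, so values that are compressed or encoded (for example as Base64) without being encrypted would not be flagged.

```python
def find_plaintext_metadata(vault_bytes: bytes, known_values: dict[str, str]) -> list[str]:
    """Return labels of known values that appear verbatim in the stored vault."""
    leaked = []
    for label, value in known_values.items():
        if not value:
            continue  # an empty string would trivially match
        # Check UTF-8 and UTF-16LE, since string encodings differ between stores.
        candidates = (value.encode("utf-8"), value.encode("utf-16-le"))
        if any(needle in vault_bytes for needle in candidates):
            leaked.append(label)
    return leaked


if __name__ == "__main__":
    # Hypothetical vault contents: the URL is readable, the rest is opaque bytes.
    vault = b"url=https://example.com;user=\x93\x1f\x07;pw=\xa1\xb2\xc3"
    planted = {
        "url": "https://example.com",
        "username": "alice-test-7731",
        "password": "correct horse battery staple",
    }
    print(find_plaintext_metadata(vault, planted))
```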

4.2. Default Configuration Analysis

Several password managers were found to have insecure default settings, such as enabling autofill without user confirmation or storing passwords with weak encryption parameters. These defaults put users who do not customize their security settings at risk.

5. Autofill Mechanism Vulnerabilities

Autofill features, while convenient, introduce significant attack surfaces that were exploited in this evaluation.

5.1. Clickjacking Attacks

Multiple password managers were vulnerable to clickjacking attacks where malicious websites could trick users into revealing passwords through invisible overlays or carefully crafted UI elements. The attack success rate varied between managers from 15% to 85%.

5.2. Cross-Site Scripting (XSS) Risks

Unlike five years ago, most managers now have basic protections against simple XSS attacks. However, sophisticated XSS attacks combining multiple techniques could still bypass these protections in several managers.

6. Experimental Results & Findings

The evaluation produced several key findings across the 13 tested password managers:

  • Password Generation Issues: 4 of 13 managers showed statistically significant non-random character distributions
  • Storage Vulnerabilities: 7 managers stored metadata unencrypted, and 3 had insecure default settings
  • Autofill Exploits: 9 managers were vulnerable to clickjacking, and 4 were vulnerable to advanced XSS attacks
  • Overall Improvement: 60% reduction in critical vulnerabilities compared to the 2014 evaluations

Chart Description: A bar chart would show vulnerability counts across three categories (Generation, Storage, Autofill) for each of the 13 password managers. The chart would clearly show which managers performed best and worst in each category, with color coding indicating severity levels.

7. Technical Analysis & Framework

Core Insight

The password manager industry has made measurable but insufficient progress. While the sheer volume of critical vulnerabilities has decreased since 2014, the nature of the remaining flaws is more insidious. We're no longer dealing with basic encryption failures but with subtle implementation bugs and poor default configurations that erode security at the margins. This creates a dangerous false sense of security among users who assume password managers are "set and forget" solutions.

Logical Flow

The paper follows a compelling narrative arc: establish the persistent problem of password security, position password managers as the theoretical solution, systematically dismantle this assumption through empirical testing, and conclude with actionable improvements. The methodology is sound—replicating past studies creates a valuable longitudinal dataset, while the novel focus on password generation addresses a critical gap. However, the study's external validity is limited by its snapshot approach; security is a moving target, and today's patch could create tomorrow's vulnerability.

Strengths & Flaws

Strengths: The scale is impressive—147 million generated passwords represents serious computational effort. The three-pillar framework (generation, storage, autofill) is comprehensive and logically sound. The comparison to 2014 baselines provides crucial context about industry progress (or lack thereof).

Flaws: The paper curiously avoids naming the worst performers, opting for anonymized references. While understandable from a liability perspective, this undermines the study's practical utility for consumers. The analysis also lacks depth on root causes—why do these vulnerabilities persist? Is it resource constraints, architectural decisions, or market incentives?

Actionable Insights

  1. For Users: Do not assume that password-manager-generated passwords are inherently strong. Check the configured length (at least 18 characters for offline attack resistance) and consider reviewing the character distribution manually.
  2. For Developers: Test generator output for randomness with established statistical tools such as the NIST Statistical Test Suite (SP 800-22), and draw random values from a cryptographically secure source. Encrypt all metadata, not just passwords.
  3. For Enterprises: Conduct regular third-party security assessments of password managers, focusing on the specific vulnerabilities outlined here.
  4. For Researchers: Expand testing to mobile platforms and investigate the economic incentives that allow these vulnerabilities to persist.

Analysis Framework Example

Case Study: Evaluating Password Randomness

To assess password generation quality, researchers can implement the following evaluation framework without requiring access to proprietary source code:

  1. Sample Collection: Generate 10,000 passwords from each manager using default settings
  2. Entropy Calculation: Compute Shannon entropy $H = -\sum_i p_i \log_2 p_i$ of the character distribution, where $p_i$ is the observed relative frequency of character $i$
  3. Statistical Testing: Apply chi-square test with null hypothesis $H_0$: characters are uniformly distributed
  4. Pattern Detection: Search for positional biases (e.g., special characters only at ends)
  5. Attack Simulation: Model guessing attacks using probabilistic techniques such as the probabilistic context-free grammars of Weir et al.'s "Password Cracking Using Probabilistic Context-Free Grammars"

This framework mirrors the approach used in the paper while being implementable by independent researchers or auditing organizations.
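Steps 2 and 3 of the framework can be sketched with the Python standard library alone. The code below is an illustration, not the paper's implementation: the function names are assumptions, the p-value uses the Wilson-Hilferty normal approximation to the chi-square distribution rather than an exact computation, and `secrets.choice` stands in for the password manager under test.

```python
import math
import secrets
import string
from collections import Counter
from statistics import NormalDist


def shannon_entropy_bits(counts: Counter) -> float:
    """Shannon entropy H = -sum(p_i * log2(p_i)) of the observed characters."""
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values() if c)


def chi_square_uniform(counts: Counter, alphabet: str) -> tuple[float, float]:
    """Return (statistic, approximate p-value) for H0: uniform over alphabet.

    Characters outside `alphabet` are ignored.
    """
    if len(alphabet) < 2:
        raise ValueError("alphabet must contain at least two characters")
    total = sum(counts[ch] for ch in alphabet)
    expected = total / len(alphabet)
    stat = sum((counts[ch] - expected) ** 2 / expected for ch in alphabet)
    dof = len(alphabet) - 1
    # Wilson-Hilferty: (X/k)^(1/3) is approximately normal for chi-square X.
    z = ((stat / dof) ** (1 / 3) - (1 - 2 / (9 * dof))) / math.sqrt(2 / (9 * dof))
    return stat, 1 - NormalDist().cdf(z)


if __name__ == "__main__":
    alphabet = string.ascii_letters + string.digits
    # Stand-in generator: replace with passwords collected from a real manager.
    sample = ["".join(secrets.choice(alphabet) for _ in range(16)) for _ in range(10_000)]
    counts = Counter("".join(sample))
    stat, p_value = chi_square_uniform(counts, alphabet)
    print(f"entropy per character: {shannon_entropy_bits(counts):.4f} bits")
    print(f"maximum possible:      {math.log2(len(alphabet)):.4f} bits")
    print(f"chi-square statistic:  {stat:.1f}, approximate p-value: {p_value:.3f}")
```

A small p-value is evidence against uniformity. When many managers or many positions are tested at once, the significance threshold should be adjusted for multiple comparisons.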

8. Future Directions & Recommendations

Based on the findings, several future directions and recommendations emerge:

Technical Improvements

  • Implementation of formal verification for password generation algorithms
  • Development of standardized security APIs for password managers
  • Integration of hardware security keys for master password protection
  • Adoption of zero-knowledge architectures where the service provider cannot access user data

Research Opportunities

  • Longitudinal studies tracking specific password managers' security evolution
  • User behavior studies on password manager configuration and usage patterns
  • Economic analysis of security investment in password management companies
  • Cross-platform security comparisons (desktop vs. mobile vs. browser)

Industry Standards

  • Development of certification programs for password manager security
  • Standardized vulnerability disclosure processes specific to password managers
  • Industry-wide adoption of secure defaults (e.g., mandatory user confirmation for autofill)
  • Transparency reports detailing security testing methodologies and results

The future of password managers likely involves integration with emerging authentication standards like WebAuthn and passkeys, potentially reducing reliance on traditional passwords altogether. However, during this transition period, improving current password manager security remains critically important.

9. References

  1. Oesch, S., & Ruoti, S. (2020). That Was Then, This Is Now: A Security Evaluation of Password Generation, Storage, and Autofill in Browser-Based Password Managers. USENIX Security Symposium.
  2. Li, Z., He, W., Akhawe, D., & Song, D. (2014). The Emperor's New Password Manager: Security Analysis of Web-based Password Managers. USENIX Security Symposium.
  3. Silver, D., Jana, S., Boneh, D., Chen, E., & Jackson, C. (2014). Password Managers: Attacks and Defenses. USENIX Security Symposium.
  4. Stock, B., & Johns, M. (2014). Protecting Users Against XSS-based Password Manager Abuse. ACM Asia Conference on Computer and Communications Security (ASIA CCS).
  5. Weir, M., Aggarwal, S., Medeiros, B., & Glodek, B. (2009). Password Cracking Using Probabilistic Context-Free Grammars. IEEE Symposium on Security and Privacy.
  6. Herley, C. (2009). So Long, And No Thanks for the Externalities: The Rational Rejection of Security Advice by Users. NSPW.
  7. NIST. (2017). Digital Identity Guidelines: Authentication and Lifecycle Management. NIST Special Publication 800-63B.
  8. Fahl, S., Harbach, M., Acar, Y., & Smith, M. (2013). On the Ecological Validity of a Password Study. SOUPS.
  9. Goodin, D. (2019). The sorry state of password managers—and what should be done about it. Ars Technica.
  10. OWASP. (2021). Password Storage Cheat Sheet. OWASP Foundation.