My miniLock Concerns
Or on the cyber-playing field
Disclaimer: Some of my concerns and proposed changes may be addressed and incorporated into the miniLock code before the official release, greatly improving security. I will post a new entry once the merge has happened to describe the changes and how they make miniLock much more secure. Until then, please take the below with a large grain of salt (pun intended).
Prior to this year's HOPE conference, a few articles appeared touting Nadim Kobeissi's new project, miniLock. He and I got into a heated debate over some issues in his design that concerned me as a certified non-cryptographer (at least as far as I could gather the design from the pre-release Wired article). The essence of my concerns was that in miniLock, Nadim uses the human user as the sole source of entropy for ECC key generation. At the time of the debate, Nadim suggested I wait until the big unveiling at HOPE to pass judgement, as he had some research into secure human-generated pass-phrases that he was going to share.
HOPE was last weekend (I didn't go), and the design specification has finally been posted along with the audited code. It appears that my initial concerns were warranted: each key is generated directly from the user's pass-phrase with no salt, so the true amount of entropy in the key generation process is unknown.
For those of you who are not aware: when a key is generated, a strong random number generator is generally used to produce a uniformly random secret of a known size (e.g. 128 or 256 bits). This becomes the basis for the private key (and the public key in an asymmetric cryptography system), and the user's password is used only to lock the key against unauthorized use. This means the user's password (assuming the key files are not stolen) has no bearing on the security of the messages or data encrypted with the keys. An attacker would need a targeted attack to steal a user's key files and then crack the user's password for that key.
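To make the conventional approach concrete, here is a minimal sketch in Python. The function names are mine and purely illustrative. PBKDF2 and the XOR mask are stand-ins I chose so the example runs with only the standard library; a real tool would wrap the key with an authenticated cipher. The point is only that the private key comes from the system's random number generator and the password merely locks it.

```python
import hashlib
import secrets


def generate_private_key() -> bytes:
    # 256 bits straight from the OS CSPRNG; the user contributes nothing here.
    return secrets.token_bytes(32)


def _mask(password: str, salt: bytes, length: int) -> bytes:
    # Assumption: PBKDF2-HMAC-SHA256 as a stand-in password-based KDF.
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000, dklen=length)


def wrap_key(private_key: bytes, password: str) -> tuple[bytes, bytes]:
    # A fresh random salt per key file means two users with the same
    # password still get unrelated wrapped blobs.
    salt = secrets.token_bytes(16)
    mask = _mask(password, salt, len(private_key))
    # Toy XOR wrap, illustration only: no integrity protection.
    wrapped = bytes(a ^ b for a, b in zip(private_key, mask))
    return salt, wrapped


def unwrap_key(salt: bytes, wrapped: bytes, password: str) -> bytes:
    mask = _mask(password, salt, len(wrapped))
    return bytes(a ^ b for a, b in zip(wrapped, mask))


key = generate_private_key()
salt, wrapped = wrap_key(key, "my weak password")
recovered_key = unwrap_key(salt, wrapped, "my weak password")
```

Even with a weak password, an attacker who never obtains the wrapped key file is still facing a 256-bit random key.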
With miniLock, this is not the case: the user's password is the key (run through a deterministic derivation algorithm). This also allows the generation of a rainbow table. If you generate probable pass-phrases (5-8 English words concatenated and slightly permuted), you will have a host of likely miniLock IDs; then, when a miniLock file is intercepted, if it is addressed to one of the IDs in your table, you can decrypt it. Rainbow tables are generally computed in a distributed fashion, similar to SETI@home, where individual computers (or cloud instances) work in unison to cover a high percentage of the key space.
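The precomputation attack can be sketched in a few lines of Python. To be clear about the assumptions: derive_id is a hypothetical stand-in for any deterministic, unsalted derivation and is not miniLock's actual algorithm, the word list is a toy, and the table is a plain lookup dictionary rather than a true rainbow table (which trades storage for computation using hash chains). The weakness being exploited is the same either way.

```python
import hashlib
from itertools import permutations


def derive_id(passphrase: str) -> str:
    # Hypothetical stand-in: deterministic and unsalted, so the same
    # pass-phrase always yields the same ID for every user, everywhere.
    return hashlib.pbkdf2_hmac("sha256", passphrase.encode(), b"", 1_000).hex()


# Toy dictionary; a real attacker would use thousands of common words.
WORDS = ["correct", "horse", "battery", "staple", "orange"]


def build_table(words: list[str], length: int) -> dict[str, str]:
    # Computed once, offline, before any victim is chosen.
    table = {}
    for combo in permutations(words, length):
        phrase = " ".join(combo)
        table[derive_id(phrase)] = phrase
    return table


table = build_table(WORDS, 3)

# Later: an intercepted file is addressed to this ID.
victim_id = derive_id("horse staple orange")
recovered = table.get(victim_id)  # the pass-phrase, if it was in the table
```

With a per-user random salt mixed into the derivation, the table would have to be rebuilt for every target, which is exactly the cost the attacker avoids here.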
Humans are not a strong source of randomness. There are countless patterns that we fall into because, at the root of it, we have evolved to be lazy and conserve energy (a great book on this). The magic of strong cryptography is that it largely removes users from their own security posture (excepting targeted attacks). SSL and TLS allow every user to transmit sensitive information across the network without being an expert in security. Yes, they will probably have swarms of malware on the endpoint (which miniLock will not protect against either), but they are mostly not in control of their own security destiny (which is a good thing).
When you force users to determine their own security destiny, they will only make the game easier for attackers. miniLock uses the zxcvbn "entropy" estimator to judge how secure a pass-phrase is and either allows or disallows it. Aside from some obvious implementation/design issues, there is a fundamental issue with this approach: as password cracking software improves, the security of every message and miniLock ID decreases. If a smarter password cracker (e.g. one that combines 5-8 words from the 5,000 most common) can crack your key faster than a "dumb" brute-force cracker, then you have an issue. 128 bits of uniform entropy will take the same amount of time (probabilistically) to crack with any password cracker (assuming a fixed number of keys/s), because it is a constant information-theoretic entropy, not a back-of-the-envelope calculation that makes assumptions about the adversary.
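A quick back-of-the-envelope comparison shows the gap. The sketch below assumes the attacker knows the scheme (words drawn uniformly from a 5,000-word list) and guesses at a fixed rate; the rate of one billion guesses per second is my own arbitrary assumption, and real users who pick words non-uniformly would fare worse.

```python
import math


def passphrase_bits(wordlist_size: int, words: int) -> float:
    # Entropy of `words` independent, uniform picks from the list.
    return words * math.log2(wordlist_size)


def expected_crack_seconds(bits: float, guesses_per_second: float) -> float:
    # On average the attacker searches half of the space.
    return 2 ** (bits - 1) / guesses_per_second


RATE = 1e9  # assumed guesses per second, for illustration only
SECONDS_PER_YEAR = 365.25 * 24 * 3600

cases = [
    ("5 words from 5,000", passphrase_bits(5000, 5)),
    ("8 words from 5,000", passphrase_bits(5000, 8)),
    ("128-bit random key", 128.0),
]
for label, bits in cases:
    years = expected_crack_seconds(bits, RATE) / SECONDS_PER_YEAR
    print(f"{label}: {bits:.1f} bits, ~{years:.3g} years")
```

Five such words come to roughly 61 bits and eight to roughly 98 bits, both well short of 128, and those figures only hold if the words really are chosen uniformly at random.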
In his 2011 ShmooCon keynote, Mudge talks about the "cyber-playing field" and notes that traditional attack/defense strategies yield an exponential advantage to the attacker. This is an interesting example of that relationship. Nadim and the author(s) of zxcvbn have to publish their software and assumptions in order to gain the community's trust; the attacker has free access to that information but is not required to share his or her cracking techniques. At THREADS last fall, there was a fascinating talk on how humans respond to password requirements and restrictions. I would wager that a similar analysis could be performed on the zxcvbn algorithm to reveal the mental "crutches" our lazy brains use to get around a restriction.
In conclusion, a highly usable encryption tool is a lofty goal that we as a community should strive for, but not at the expense of privacy and security. In a following post, I will outline a similarly usable system architecture that does not have the same security flaws. The current trend towards higher surveillance of internet users should be met with stronger privacy protections, greater oversight, transparency and education of the user base; miniLock fulfills none of these and, other than as a neat toy, should not be used for strong privacy or for the protection of freedom-fighting activities under oppressive regimes.
Cyber-security Philosopher and Boffin