Just because you're paranoid...

Safe storage of sensitive user data

Lennon Day-Reynolds

Hats ready, everyone…

http://flickr.com/photos/xt0ph3r/507318265/

photo by r3v || cls (cc-attrib-sharealike)

Background

Passwords aren’t enough—you need some way for users to prove their identity during registration, password resets, etc.

Identity information can be just as valuable as a password for an attacker.

...but no one is really interested in my users’ data, right?

Denial is not a river in Africa

http://flickr.com/photos/benshepherd/535970250/

photo by Ben Shepherd (cc-attrib-noncom-sharealike)

They are out to get you

If you’re keeping passwords, security questions, account numbers, or any other sensitive data in plaintext, you (and your users) are vulnerable.

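In fact, you’re vulnerable to an inside job or social engineering even if you’re using normal, symmetric crypto, because the decryption key has to live somewhere your application (and its operators) can reach.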

So, what can you do?

One-way hashing to the rescue

We can generate a hash digest from any input string.

Hashing passwords is standard practice…you’re all using at least MD5 for your passwords, right?

Works for any field where we only need to test for an exact match, not view the original plaintext.

You can also use a digest as a lookup key to protect user identity in other tables.
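A rough sketch of the lookup-key idea, assuming SHA-256 from Ruby's standard `openssl` library. The `lookup_key` helper and the in-memory `orders` Hash (standing in for a second table) are hypothetical names used only for illustration.

```ruby
require 'openssl'

# Hypothetical helper: derive an opaque lookup key from a salt and a value.
def lookup_key(salt, value)
  OpenSSL::Digest::SHA256.hexdigest(salt + value)
end

salt = 'some_random_salt' # illustrative only; use a long random secret

# Stand-in for a second table, keyed by digest instead of by username.
orders = {}
orders[lookup_key(salt, 'rcoder')] = ['order 1', 'order 2']

# The same input always produces the same key, so lookups still work...
found = orders[lookup_key(salt, 'rcoder')]
# ...while any other input lands on a different key.
missing = orders[lookup_key(salt, 'someone_else')]
```

Anyone reading the `orders` table sees only digests; linking a row back to a user requires both the salt and the username.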

How does it work?

We’re just putting a salted digest of each object (string, tuple, etc.) into a simple Hash-like store.

At any later point, we can test to see if that value exists.

If it does, we know the provided values (password, PIN code, etc.) were correct.

Without all the components (salt, tuple/object values) you can’t find the record.

Toy implementation: KeySpace


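require 'openssl'

class KeySpace
  # `salt' is a secret used to randomize hashes so
  # precomputed dictionary attacks won't work
  # `store' can be any Hash-like object
  def initialize(salt, store=Hash.new)
    # OpenSSL::Digest::SHA (SHA-0) is unavailable in recent
    # Ruby/OpenSSL versions; SHA1 has the same 160-bit output size
    @salt = OpenSSL::Digest::SHA1.digest(salt)
    @store = store
  end

  # true if this exact token has been stored before
  def exists?(token)
    @store.has_key?(secure_hash(token))
  end

  # records only the digest of the token, never the token itself
  def store(token)
    @store[secure_hash(token)] = true
  end
end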

KeySpace, cont.


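class KeySpace
  #...

  # other hash algorithms (e.g. SHA-256) could be used here;
  # SHA1 is used because plain SHA (SHA-0) is unavailable in
  # recent Ruby/OpenSSL versions.
  # Marshal output depends on the Ruby version and on string
  # encodings, so store and check under the same setup
  def secure_hash(obj)
    obj_str = Marshal.dump(obj)
    OpenSSL::Digest::SHA1.hexdigest(obj_str + @salt)
  end
end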

Basic KeySpace tests


context "KeySpace instance" do
  setup do 
    @space = KeySpace.new('some_random_salt')
    @q = ['rcoder', 'acct_num', '123-456']
    @space.store(@q)
  end

  specify "can find correct values" do
    @space.exists?(@q).should.be true
  end

  specify "cannot find incorrect values" do
    @space.exists?(['other']).should.be false
  end
end

KeySpace + secondary lookup


context "KeySpace with secondary storage" do
  setup do
    @space = KeySpace.new('my_app_salt')
    ActiveRecord::Base.establish_connection(...)
  end

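  specify "supports user lookup" do
    token = ['user', 'pass']
    @space.store(token)
    # exists? hashes its argument itself, so pass the token,
    # not the digest
    @space.exists?(token).should.be true

    # the digest is the key into the secondary table
    sec_hash = @space.secure_hash(token)
    user = SiteUser.find_by_secure_hash(sec_hash)
    user.should.not.be nil
  end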
end

The internal data is secure

Even if an attacker has the ability to read your entire database, it won’t give them much to work with:


  irb> ks = KeySpace.new('some_random_salt')
  => #<KeySpace:0x101c920 ...>
  irb> ks.store(
    ['rcoder', 'acct_num', '123-456']
  )
  => true
  irb> ks.instance_eval { @store.keys }
  => ["9e18107895602485b711a5a67bf6670d78a75296"]

What’s it good for?

Without the salt, an attacker holding the stored digests has very little to work with; even with it, an attacker still has to resort to dictionary search. (This includes `root’!)
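To see why a known salt only slows an attacker down, here is a minimal sketch of the dictionary search they are left with. The `token_digest` and `dictionary_search` helpers are hypothetical, and SHA-256 is an assumption standing in for whatever digest the store actually uses.

```ruby
require 'openssl'

# Same shape as the toy secure_hash: digest of the marshalled token
# followed by the hashed salt (SHA-256 here is an assumption).
def token_digest(salt, token)
  salt_digest = OpenSSL::Digest::SHA256.digest(salt)
  OpenSSL::Digest::SHA256.hexdigest(Marshal.dump(token) + salt_digest)
end

# Hypothetical attacker's view: a stolen digest, a known salt and a list
# of guesses. Returns the first guess that matches, or nil.
def dictionary_search(stolen_digest, salt, candidates)
  candidates.find { |guess| token_digest(salt, guess) == stolen_digest }
end

stolen  = token_digest('some_random_salt', ['rcoder', 'pin', '1234'])
guesses = %w[0000 1111 1234 4321].map { |pin| ['rcoder', 'pin', pin] }

dictionary_search(stolen, 'some_random_salt', guesses) # finds the '1234' tuple
dictionary_search(stolen, 'wrong_salt', guesses)       # nil: no guess matches
```

The search only succeeds when the real value is in the candidate list, so low-entropy fields such as short PINs remain the weak point once the salt leaks.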

Hint: for even more security, provide the salt interactively at app startup so it is never written to disk alongside the data.
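One way the interactive prompt might look, using Ruby's standard `io/console` library to turn off echo when reading from a terminal. The `read_salt` helper and its prompt text are hypothetical; this is a sketch, not a hardened implementation.

```ruby
require 'io/console'

# Hypothetical helper: ask an operator for the salt at startup instead
# of keeping it in a config file. Echo is disabled on a real terminal.
def read_salt(input = $stdin, output = $stderr)
  output.print 'KeySpace salt: '
  line = if input.respond_to?(:noecho) && input.tty?
           input.noecho(&:gets)
         else
           input.gets # piped or scripted input, e.g. in tests
         end
  output.puts
  salt = line.to_s.chomp
  raise ArgumentError, 'salt must not be empty' if salt.empty?
  salt
end
```

At startup this could be wired in as `KeySpace.new(read_salt)`, so the salt lives only in process memory.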

Could easily be implemented as a model helper for your ActiveRecord objects, to avoid the explicit secondary lookup.
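A sketch of what such a helper might look like as a plain Ruby mixin. The `SecureHashed` module, the `secure_hash_from` declaration and the `Login` Struct (standing in for an ActiveRecord model) are all hypothetical, and SHA-256 is an assumption.

```ruby
require 'openssl'

# Hypothetical mixin: a class declares which attributes form its token
# and gets a `secure_hash` method computed from them.
module SecureHashed
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    attr_reader :secure_fields, :secure_salt

    def secure_hash_from(*fields, salt:)
      @secure_fields = fields
      @secure_salt = OpenSSL::Digest::SHA256.digest(salt)
    end
  end

  # Digest of the declared attribute values plus the hashed salt.
  def secure_hash
    values = self.class.secure_fields.map { |field| public_send(field) }
    data = Marshal.dump(values) + self.class.secure_salt
    OpenSSL::Digest::SHA256.hexdigest(data)
  end
end

# Plain Struct standing in for an ActiveRecord model.
Login = Struct.new(:username, :password) do
  include SecureHashed
  secure_hash_from :username, :password, salt: 'my_app_salt'
end

Login.new('user', 'pass').secure_hash # 64 hex characters
```

In a real model the same method could feed a `secure_hash` column used for lookups; that wiring is left out here.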

What isn’t it good for?

Obviously, if you need to inspect the values, you’re SOL.

Manual introspection of the database is pretty much useless, too: 160-bit digests are bad mnemonics.

Expiration can be hard, since we don’t know whose records are associated with what digest. (On the other hand, it’s only a few bytes per record, so you can probably just leave old records lying around.)
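If leaving old records around is not acceptable, one possible workaround is to store a timestamp as the value instead of `true`, so records can be purged by age without knowing who owns them. The `ExpiringKeySpace` class below is a hypothetical variant of the toy store, with SHA-256 as an assumed digest.

```ruby
require 'openssl'

# Hypothetical variant of the toy store: the value is the time the
# digest was stored, so stale entries can be purged by age alone.
class ExpiringKeySpace
  def initialize(salt, store = {})
    @salt = OpenSSL::Digest::SHA256.digest(salt)
    @store = store
  end

  def store(token, now = Time.now)
    @store[secure_hash(token)] = now
    true
  end

  def exists?(token)
    @store.key?(secure_hash(token))
  end

  # Delete every digest stored before `cutoff`; returns how many were removed.
  def purge_older_than(cutoff)
    before = @store.size
    @store.delete_if { |_digest, stored_at| stored_at < cutoff }
    before - @store.size
  end

  private

  def secure_hash(obj)
    OpenSSL::Digest::SHA256.hexdigest(Marshal.dump(obj) + @salt)
  end
end
```

The trade-off is that a timestamp leaks when a record was created, which may or may not matter for a given application.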

Better safe than sorry

http://flickr.com/photos/robnwatkins/397488557/

photo by – RobW – (cc-attrib-noncom-sharealike)

End

Presentation online at: http://rcoder.net/foscon07/

For more advanced techniques, check out Translucent Databases by Peter Wayner