Technology Preview: Signal Private Group System
jimio on 09 Dec 2019
Groups are inherently social, and Signal is a social app. Whether you’re planning a surprise party, discussing last night’s book club meeting, exchanging photos with your family, or organizing something important, group messaging has always been a key feature of Signal.
Signal provides private groups: the Signal service has no record of your group memberships, group titles, group avatars, or group attributes. We’ve been working on new private group technology that will enable group administrators and access control, improve group scalability, and set the stage for a much richer group experience – all while maintaining Signal’s unique group security and privacy properties. We’re moving into the future while keeping what we loved about the past.
Retrospective: Groups without groupthink
You can’t send a message to everyone in a group unless you know who is in it. Group members also need to be able to retrieve the latest group state in order to render it: group name, group image, group membership, as well as any other optional elements (such as a pinned group welcome message).
This data is sensitive, but the traditional approach to managing group state is storing it in a plaintext database on a server. This makes it simple for clients to retrieve the latest group state, and for the server to enforce access control and consistency through a basic API, so it’s what essentially everyone else does.
──────────────────────────────────────────────────────────────────────────────────────────────
| id  | group_name        | avatar          | members                  | welcome_message     |
──────────────────────────────────────────────────────────────────────────────────────────────
| 765 | "Surprise party!" | "party-hat.jpg" | sofia (admin), wei, hugo | "Don't tell Alice!" |
| 766 | "Book Club Ideas" | "library.png"   | jakob, saanvi (admin)    | "Les Mis again?"    |
| 767 | "Finals Week"     | "crying.gif"    | bob, lucia (admin), leo  | "Only 4 days left"  |
──────────────────────────────────────────────────────────────────────────────────────────────
The obvious downside is that this enables the server (and thus anyone who compromises or subpoenas the server) to see all of this personal information. The server knows everything about all groups, and it can also surreptitiously modify group membership and other group attributes. This isn’t what we wanted for Signal.
The group conversation scheme that we introduced in 2014 was built on the existing pairwise encrypted channels that are already used in one-on-one Signal conversations. Clients send group messages to each other tagged with a Group ID (a random 128-bit secret that cannot be guessed), and they also exchange group state updates – such as the group’s name, attributes, and membership – via the same method. Clients never tell the service which messages are group messages or individual messages, or who is in the group. Instead, clients tell each other what they need to know.
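As a rough illustration of that fan-out, here is a minimal Python sketch. The names `new_group_id`, `Envelope`, `seal_for`, and `fan_out` are hypothetical, and `seal_for` is only a placeholder for the pairwise encrypted channel, not real encryption. The point is what the service gets to see: one opaque envelope per recipient, with the Group ID tucked inside the sealed payload.

```python
import secrets
from dataclasses import dataclass
from typing import List

GROUP_ID_BYTES = 16  # 128 bits, as described above


def new_group_id() -> bytes:
    """Generate a random, unguessable 128-bit Group ID."""
    return secrets.token_bytes(GROUP_ID_BYTES)


@dataclass(frozen=True)
class Envelope:
    """Everything the service relays: a recipient and an opaque blob."""
    recipient: str
    sealed: bytes


def seal_for(recipient: str, plaintext: bytes) -> bytes:
    # HYPOTHETICAL placeholder for the pairwise encrypted channel.
    # A fixed XOR mask is NOT encryption; it only keeps the sketch runnable.
    return bytes(b ^ 0x5A for b in plaintext)


def fan_out(sender: str, members: List[str], group_id: bytes, body: str) -> List[Envelope]:
    """Send one individually sealed copy to every member except the sender."""
    inner = group_id + body.encode("utf-8")  # the group tag travels inside the payload
    return [Envelope(m, seal_for(m, inner)) for m in members if m != sender]


group_id = new_group_id()
envelopes = fan_out("sofia", ["sofia", "wei", "hugo"], group_id, "Don't tell Alice!")
```

Nothing in an `Envelope` distinguishes a group message from an individual one, which is the property the paragraph above describes.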
This approach hasn’t been perfect. If two group members try to update the group state at the same time, this can create a race condition as these messages cross paths. It’s unclear which update should be treated as authoritative, and this can lead to divergence in the members’ view of the current state. It also largely precludes role-based access control: every group member in Signal has the same permissions because what you learn about the group is only what other people tell you. Their view of the group state may not be accurate, or they could claim to hold a role they don’t really have.
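The race is easy to reproduce in a toy model. The sketch below assumes the simplest possible client, one that applies updates in the order they arrive ("last write wins"); the `Client` class and the group names are hypothetical and only illustrate the divergence.

```python
from dataclasses import dataclass


@dataclass
class Client:
    """A toy client that applies group updates in arrival order."""
    name: str
    group_name: str = "Book Club"

    def apply(self, new_name: str) -> None:
        self.group_name = new_name  # last update received wins


wei = Client("wei")
hugo = Client("hugo")

# Both members rename the group at the same moment and apply their own change locally.
wei.apply("Book Club Ideas")
hugo.apply("Sci-Fi Club")

# The two update messages cross paths, so each client receives the other's update last.
wei.apply("Sci-Fi Club")
hugo.apply("Book Club Ideas")

diverged = wei.group_name != hugo.group_name
```

After the exchange, Wei's client shows Hugo's name for the group and Hugo's client shows Wei's. With no single source of truth, neither has a way to know which one is authoritative.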
Some readers might recognize these as problems inherent to distributed systems. They require complex consensus protocols like Paxos or Raft to solve robustly, which are unfortunately impractical to implement across inconsistently connected asynchronous clients running on mobile phones.
So even though we would never accept the inadequate security associated with storing huge amounts of sensitive group data server-side in plaintext, we have looked longingly at the simplicity of a single source of truth and how much more would be possible with it. Ideally, we could somehow create the best of both worlds: a single source of truth that clients can easily reference, but one that also preserves privacy.
Return of the MAC
Let’s try to build a system that stores group information privately, but canonicalized on a server. Clients in a group could share a symmetric key and use it to encrypt group information so that it can be stored on the service, but in a way that is completely opaque to the service.
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
| id  | group_name                          | avatar         | members                                                 | welcome_message                                         |
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
| 765 | "E2jP6c8LB8ESLdTNy/tingqAZFj2so8s=" | encrypted-file | "2wb5fLpn1yun0acrEbVifXAf566ONu4nwhL2/GDntYdxxn30pejHj" | "uJjNnyiPnMmfFtXpMkhFA/v9NQQfllfSJ7dYZWBkfTJvIw/jpy4+s" |
| 766 | "vNea+GD5bETfL4YS2g/Flhz2lrikq6vY=" | encrypted-file | "Y+bt/qjxu8pacBVYyBGJAoVPlIH9T4LeKwoCqf8KeOMMG5s5mU+mY" | "hKSQ2HnDoHvpwuxBCaCuvJXwjIq+U9cQ="                     |
| 767 | "dil09QiiGNlt5TajrmfqVs/6VZnk6eSE=" | encrypted-file | "+v2Hh1Sh0X64kpAyhXvsn7tuhrNuOpiq1L6VpMV4jvnqzzf+taKdE" | "L+88zl5AXn+5aY3k/CXDTTLyhJwRnAuojqAksXtNmoxyCdAi/PeJS" |
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
But there’s a snag: the service won’t be able to authenticate users, enforce who is allowed to fetch and modify group entries, or validate their contents if the service is unable to read the data. The inherent contradiction is that the service needs to authenticate whether a membership record corresponds to the user making a request, but the user doesn’t want the service to know who they are.
This is the type of problem that anonymous credentials were invented to solve. With an anonymous credential scheme, the service could issue authentication credentials to clients. Those clients could later prove possession of a credential, as well as facts about attributes bound into the credential, without revealing anything else.
Historically, most anonymous credential schemes enable the credential issuer and credential verifier to be different parties, which is achieved by using complex and costly signature algorithms. These costs are one reason anonymous credentials have seen limited real-world use. However, in Signal’s case the issuer and the verifier would be the same party (the Signal service), which raises the possibility of using more efficient MAC-based keyed-verification anonymous credentials – a concept introduced several years ago by Melissa Chase, Sarah Meiklejohn, and Greg Zaverucha.
Unfortunately, existing efficient KVAC schemes (like the Chase-Meiklejohn-Zaverucha scheme which introduced the KVAC concept) don’t support efficient proofs that a credential attribute matches some encrypted plaintext. So we worked with Melissa Chase and Greg Zaverucha on an extension of this scheme that supports this property. Using these encryption-compatible KVACs, group members can be issued authentication credentials by the service for their user identity (UID). They can then authenticate by proving to the server that they hold an auth credential issued over the same UID as some encrypted group membership entry, without revealing their UID or anything else.
A couple of problems remain: (1) There’s a catch-22 here: clients have to prove their credential matches some ciphertext before they’re allowed to download the ciphertext. (2) Because encrypted entries do not reveal anything about the plaintext within, a malicious client could simply add the same UID over and over, potentially making themselves difficult to delete or confusing the server’s access control rules. Luckily, both problems have the same solution: if the encryption process is made deterministic so the same UID always encrypts to the same ciphertext within a group, then the client can recreate their ciphertext without fetching it, and the server can easily detect and reject duplicate entries.
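To make the determinism property concrete, here is a toy Python sketch. It is an assumption-laden stand-in, not the scheme from the paper: it uses a "synthetic IV" style construction built from HMAC-SHA256 so that it runs with only the standard library, whereas the real construction is different and is designed to work with the zero-knowledge proofs. The names `encrypt_uid`, `decrypt_uid`, and `MembershipList` are hypothetical.

```python
import hashlib
import hmac


def _subkeys(group_master_key: bytes):
    """Derive separate IV and keystream keys from the group key (toy derivation)."""
    iv_key = hmac.new(group_master_key, b"siv", hashlib.sha256).digest()
    enc_key = hmac.new(group_master_key, b"enc", hashlib.sha256).digest()
    return iv_key, enc_key


def _keystream(enc_key: bytes, iv: bytes, length: int) -> bytes:
    out, counter = b"", 0
    while len(out) < length:
        out += hmac.new(enc_key, iv + counter.to_bytes(4, "big"), hashlib.sha256).digest()
        counter += 1
    return out[:length]


def encrypt_uid(group_master_key: bytes, uid: bytes) -> bytes:
    """Deterministic: the IV is derived from the plaintext, so no randomness is used."""
    iv_key, enc_key = _subkeys(group_master_key)
    iv = hmac.new(iv_key, uid, hashlib.sha256).digest()[:16]
    body = bytes(a ^ b for a, b in zip(uid, _keystream(enc_key, iv, len(uid))))
    return iv + body


def decrypt_uid(group_master_key: bytes, ciphertext: bytes) -> bytes:
    iv_key, enc_key = _subkeys(group_master_key)
    iv, body = ciphertext[:16], ciphertext[16:]
    uid = bytes(a ^ b for a, b in zip(body, _keystream(enc_key, iv, len(body))))
    expected = hmac.new(iv_key, uid, hashlib.sha256).digest()[:16]
    if not hmac.compare_digest(iv, expected):
        raise ValueError("ciphertext does not match this group key")
    return uid


class MembershipList:
    """What a server could do with opaque entries: store them and reject duplicates."""

    def __init__(self) -> None:
        self._entries = set()

    def add(self, ciphertext: bytes) -> bool:
        if ciphertext in self._entries:
            return False  # same UID under the same group key: duplicate
        self._entries.add(ciphertext)
        return True
```

Because `encrypt_uid` has no random input, a member holding the group key can recompute their own entry without downloading anything, and a server holding only ciphertexts can still refuse a second copy of an entry it already has.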
Grouping it all together
Let’s take a look at this system in action. Suppose Alice has an AuthCredential for her UID, and a GroupMasterKey (only known by group members, not the server) for some particular group. The server stores an encrypted membership list for the group. Each entry in the membership list is an encryption of some UID with the GroupMasterKey.
To add Bob to the group, Alice must first prove to the server that she is allowed to make this change. Alice provides a zero-knowledge proof to the server that she possesses an AuthCredential matching some particular entry.
We call this a “presentation” of the AuthCredential, but it’s more complicated than just sending the AuthCredential; if Alice did that, the server would be able to correlate the received AuthCredential with the AuthCredential that it issued. Instead, Alice presents a randomized form of the credential and uses some “Schnorr” and “Fiat-Shamir” magic to prove a relationship to the ciphertext without revealing anything else; see the paper for details.
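For readers who haven't seen the “Schnorr” and “Fiat-Shamir” ingredients before, here is a minimal Python sketch of the basic building block: a non-interactive proof that you know a secret exponent, without revealing it. This is far simpler than an actual AuthCredential presentation, and the parameters are toy assumptions chosen for readability (arithmetic modulo the Mersenne prime 2^127 - 1 with base 3), not anything Signal uses and not secure.

```python
import hashlib
import secrets
from typing import Tuple

# TOY parameters (assumption): illustrative only, not secure.
P = 2**127 - 1   # a Mersenne prime
G = 3            # base element
ORDER = P - 1    # exponents can be reduced mod P - 1, since G**(P - 1) % P == 1


def _challenge(*elements: int) -> int:
    """Fiat-Shamir: derive the challenge by hashing the transcript."""
    h = hashlib.sha256()
    for e in elements:
        h.update(e.to_bytes(16, "big"))
    return int.from_bytes(h.digest(), "big") % ORDER


def prove(secret: int) -> Tuple[int, int, int]:
    """Prove knowledge of `secret` such that public == G**secret mod P."""
    public = pow(G, secret, P)
    nonce = secrets.randbelow(ORDER)          # fresh randomness for every proof
    commitment = pow(G, nonce, P)
    c = _challenge(G, public, commitment)
    response = (nonce + c * secret) % ORDER
    return public, commitment, response


def verify(public: int, commitment: int, response: int) -> bool:
    """Check G**response == commitment * public**c without learning the secret."""
    c = _challenge(G, public, commitment)
    return pow(G, response, P) == (commitment * pow(public, c, P)) % P


secret = secrets.randbelow(ORDER)
public, commitment, response = prove(secret)
ok = verify(public, commitment, response)
```

The verifier sees only `public`, `commitment`, and `response`. The fresh nonce makes every proof look different, and hashing the transcript replaces the round trip an interactive challenge would need.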
After Alice proves to the server that she matches some entry, she sends the server a new entry encrypting Bob’s UID. Alice also sends Bob the GroupMasterKey via an encrypted Signal message.
Now that Bob is a member of the group, he’d like to learn who’s in the group. He can prove he is a member using his AuthCredential, then download all the entries and decrypt them with the GroupMasterKey. If he has been granted the appropriate role, Bob could also add Charlie to the group, just like Alice added him.
You are invited to join this conversation
We encourage you to explore the latest draft of the paper that covers this topic in more detail, including some features we’ll delve into in later blog posts. Please share your input on the Signal Community Forum. Your feedback is valuable, and we are excited to begin implementing these group enhancements in Signal over the coming months.
Thank you to Melissa Chase and Greg Zaverucha from Microsoft Research for collaborating on this research, Trevor Perrin for doing all of the heavy lifting on the Signal side, and additional group gratitude to Dan Boneh, Isis Lovecruft, Henry de Valence, and Jan Camenisch for discussions that deepened our understanding of this problem space.