[NKP-0016] Client side pub/sub permission control

yilun · January 19, 2020, 4:58am

This NKP proposes a pub/sub permission control mechanism that can be implemented purely on the client side.

Design Goal

We want to achieve the following goals:

each group has an owner
owner decides who can join the group (everyone, whitelist, or blacklist)
members can be removed by owner from the group
only group members can send messages to the group

Protocol

Join Group

When a client wants to join the group name owned by the user whose public key is pubkey, the client subscribes to the topic name.pubkey.

Set/Update Permission

When the owner wants to set or update who can join the group, he subscribes to the topic with a reserved family of identifiers, e.g. __0__.__permission__, __1__.__permission__, __2__.__permission__ …, and puts the permission control information in the subscribe metadata. He can use { "accept": ["*"] } to allow everyone to join, { "accept": [{"addr": "addr1"}, {"pubkey": "pk1"}, ...] } to accept a list of addresses or public keys (whitelist), or { "reject": [{"addr": "addr1"}, {"pubkey": "pk1"}, ...] } to reject a list of addresses or public keys (blacklist). Because of the metadata size limit, he might need multiple identifiers to store the permission control list.
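
A minimal sketch of how an owner-side client might split a long accept list across several permission identifiers. It assumes JSON-encoded metadata and the 1024-byte metadata limit mentioned later in this thread; splitAcceptList is a hypothetical helper, not part of any NKN SDK, and the actual subscribe call is left out.

```javascript
// Hypothetical helper: split an accept list into metadata chunks that each
// stay within maxBytes when JSON-encoded. Not part of any NKN SDK.
const byteLength = (s) => new TextEncoder().encode(s).length;

function splitAcceptList(rules, maxBytes = 1024) {
  const chunks = [];
  let current = [];
  for (const rule of rules) {
    const candidate = JSON.stringify({ accept: [...current, rule] });
    if (byteLength(candidate) > maxBytes && current.length > 0) {
      // Adding this rule would overflow the chunk, so start a new one.
      chunks.push(current);
      current = [rule];
    } else {
      current.push(rule);
    }
  }
  if (current.length > 0) chunks.push(current);
  // Chunk i is meant to be stored under identifier __i__.__permission__.
  return chunks.map((accept, i) => ({
    identifier: `__${i}__.__permission__`,
    meta: JSON.stringify({ accept }),
  }));
}

// 30 made-up 64-character hex public keys, for illustration only.
const rules = Array.from({ length: 30 }, (_, i) => ({
  pubkey: i.toString(16).padStart(64, '0'),
}));
const parts = splitAcceptList(rules);
for (const p of parts) console.log(p.identifier, byteLength(p.meta));
```

Each returned entry would then be used as the identifier and metadata of one subscription to the topic name.pubkey.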

Get Permission

When anyone wants to get the permission control information of a group name owned by pubkey, he first gets all subscriptions of the topic name.pubkey, picks out the subscriptions made by __0__.__permission__.pubkey, __1__.__permission__.pubkey, __2__.__permission__.pubkey …, and merges the metadata of these subscriptions. From the merged metadata he can compute the permission control information of the group.
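
The merge step could look roughly like the sketch below. The input shape (a map from subscriber address to metadata string) is an assumption about what a subscriber query returns, and getPermission is a hypothetical name.

```javascript
// Hypothetical input shape: { "<subscriber address>": "<metadata JSON string>" }.
function getPermission(subscribers, ownerPubkey) {
  const pattern = /^__(\d+)__\.__permission__\.(.+)$/;
  const merged = { accept: [], reject: [] };
  for (const [addr, meta] of Object.entries(subscribers)) {
    const m = pattern.exec(addr);
    // Only the owner's permission identifiers count; everything else is ignored.
    if (!m || m[2] !== ownerPubkey) continue;
    let parsed;
    try {
      parsed = JSON.parse(meta);
    } catch (e) {
      continue; // skip malformed metadata
    }
    if (parsed === null || typeof parsed !== 'object') continue;
    if (Array.isArray(parsed.accept)) merged.accept.push(...parsed.accept);
    if (Array.isArray(parsed.reject)) merged.reject.push(...parsed.reject);
  }
  return merged;
}

const subs = {
  '__0__.__permission__.ownerpk': '{"accept":[{"addr":"alice.pk1"}]}',
  '__1__.__permission__.ownerpk': '{"accept":[{"pubkey":"pk2"}]}',
  '__0__.__permission__.evilpk': '{"accept":["*"]}', // not the owner, ignored
  'alice.pk1': '',
};
console.log(JSON.stringify(getPermission(subs, 'ownerpk')));
```

Matching on the owner's public key is what keeps another subscriber from granting permissions by reusing the same identifier.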

Publish Message

When a client wants to publish a message to the group name owned by pubkey, he gets both the subscribers of the topic name.pubkey and the permitted users of the group. He then sends the message only to addresses that are both subscribed to the topic and permitted in the group.

Receive Message

When a client receives a message from a group, he checks if the sender is in the permitted users of the group. If not, client will discard the message.
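
The publish and receive rules above can be sketched together. The rule semantics here (reject wins, an accept list is exclusive, a reject-only list allows everyone else) are one possible reading of the proposal, and all function names are hypothetical.

```javascript
// Assumption: an NKN address is "identifier.pubkey" or a bare "pubkey".
const addrToPubkey = (addr) => addr.slice(addr.lastIndexOf('.') + 1);

// True if a single rule ("*", {addr}, or {pubkey}) covers the address.
function matches(rule, addr) {
  if (rule === '*') return true;
  if (rule === null || typeof rule !== 'object') return false;
  if (rule.addr !== undefined) return rule.addr === addr;
  if (rule.pubkey !== undefined) return rule.pubkey === addrToPubkey(addr);
  return false;
}

// Assumed semantics: reject always wins; with an accept list the address
// must be on it; with only a reject list everyone else is allowed.
function isPermitted(permission, addr) {
  const { accept, reject } = permission;
  if (reject && reject.some((rule) => matches(rule, addr))) return false;
  if (accept) return accept.some((rule) => matches(rule, addr));
  return Boolean(reject);
}

// Publish: only subscribers that are also permitted get the message.
const recipientsFor = (permission, subscribers) =>
  subscribers.filter((addr) => isPermitted(permission, addr));

// Receive: messages from senders outside the permitted set are dropped.
const shouldHandle = (permission, sender) => isPermitted(permission, sender);

const permission = { accept: [{ addr: 'alice.pk1' }, { pubkey: 'pk2' }] };
const subscribers = ['alice.pk1', 'bob.pk1', 'phone.pk2', 'pk2', 'mallory.pk9'];
console.log(recipientsFor(permission, subscribers));
console.log(shouldHandle(permission, 'mallory.pk9'));
```

Using the same predicate on both sides is what makes the check symmetric: a sender skips unpermitted subscribers, and a receiver ignores unpermitted senders.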

Attack Resistance

Malicious clients can try to subscribe to the topic or publish messages to group members, but honest clients, as long as they follow the protocol, will neither send messages to nor handle messages from malicious clients.

Extension

The above basic protocol can be further extended by storing more information in owner’s subscribe metadata. For example, one can have different permission control for sender and receiver, or add other users as admin, etc.

lynn · November 19, 2019, 7:20pm

Rather than topic name topic-name.pubkey, what if the owner was the person that was subscribed to the topic for the longest time in sequence?

yilun · November 20, 2019, 12:39am

Rather than topic name topic-name.pubkey , what if the owner was the person that was subscribed to the topic for the longest time in sequence?

Because subscription history is not stored in global state (to keep the state from exploding), getting this information requires going through the entire block history, which is not possible for nodes with a pruned ledger and very inefficient for those that have the full history.

Also, if we choose this way, people will start to subscribe to all common topic names, just like what happened to the name service before.

lynn · November 21, 2019, 12:56pm

Alright.

I had the idea that once the name service is back online, people could claim channels by having their name to match the topic. Maybe this topic.pubkey is a better idea, so you can’t steal admin rights, but UX-wise, I can see it becoming quite difficult for d-chat, since there will be multiple of the same looking topics.

How about the transactions that this will create? Is the 4-free-tx-per-block thing still a thing, or was that removed when spam stopped? edit: on 2nd thought, it is probably too early to worry about tx count since 99% blocks are empty

In the case of {accept: ['1','2'...]}, should the subscribers even need to subscribe, but instead just send a “please add me” message to the channel owner?

yilun · November 21, 2019, 10:14pm

I can see it becoming quite difficult for d-chat, since there will be multiple of the same looking topics

I was thinking that maybe the group name can be treated the same as a username: display name + the first few hex chars of the pk. But it might not be necessary, as lots of IM software I know (Telegram, WeChat, etc.) allows multiple groups with the same name and it does not seem to be an issue.

Also, in the future it’s probably useful to add an avatar function for both groups and people (which can be put in the sub metadata) so different groups can be told apart even more easily.

In the case of {accept: ['1','2'...]} , should the subscribers even need to subscribe, but instead just send a “please add me” message to the channel owner?

I think we need both. Subscribing means the subscriber has the intention to join the group. Without the subscribe action, the group owner can add random people to the group and send spam, just like on Telegram now.

lynn · December 14, 2019, 1:00pm

Might be worth it to make them objects: accept: [{addr: 'addr1'}, {addr: 'addr2'}], etc.

Because that way we can more easily modify it to work for timeouts, for example:
reject: [{ addr:'addr1', expires: blockNo123 }]

And I think instead of accepting/rejecting ‘identifier.pubkey’ combos, we might as well do it just for pubkey, like accept: [{ pk: '43243...' }].
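
A small sketch of how the expires field might be evaluated. It assumes expires is a block height and that a rule stops applying once the chain reaches that height; whether the boundary is inclusive is not settled in this thread, and activeRules is a hypothetical name.

```javascript
// Assumption: `expires` is a block height and a rule stops applying once the
// chain reaches it; rules without `expires` never lapse.
function activeRules(rules, currentHeight) {
  return rules.filter(
    (rule) => rule.expires === undefined || currentHeight < rule.expires
  );
}

const reject = [{ addr: 'addr1', expires: 123 }, { addr: 'addr2' }];
console.log(activeRules(reject, 100).length); // both rules still apply
console.log(activeRules(reject, 123).length); // only the permanent rule is left
```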

yilun · December 14, 2019, 6:39pm

Might be worth it to make them objects: accept: [{addr: 'addr1'}, {addr: 'addr2'}] , etc.

Because that way we can more easily modify it to work for timeouts, for example:
reject: [{ addr:'addr1', expires: blockNo123 }]

That sounds like a good idea.

And I think instead of accepting/rejecting ‘identifier.pubkey’ combos, we might as well do it just for pubkey, like accept: [{ pk: '43243...' }] .

nshd uses the protocol that if an address with a pk is specified, then only that address is allowed. If a public key is specified, then all addresses with that pk are allowed. We might be able to use that too, or perhaps use a prefix wildcard like TLS certificates do.
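
One possible reading of the prefix-wildcard idea, as a hypothetical sketch. The "*.pubkey" pattern syntax is an assumption for illustration, not something nshd or NKN defines, and matchAddrPattern is a made-up name.

```javascript
// Hypothetical matcher: "*.<pubkey>" matches any address with a non-empty
// identifier on that public key, similar in spirit to TLS wildcard names.
// Any other pattern must equal the address exactly.
function matchAddrPattern(pattern, addr) {
  if (!pattern.startsWith('*.')) return pattern === addr;
  const suffix = pattern.slice(1); // ".<pubkey>"
  return addr.length > suffix.length && addr.endsWith(suffix);
}

console.log(matchAddrPattern('*.pk1', 'alice.pk1')); // identifier on pk1
console.log(matchAddrPattern('*.pk1', 'pk1')); // bare public key, no identifier
console.log(matchAddrPattern('alice.pk1', 'alice.pk1')); // exact address
```

In this reading the bare public key is not covered by the wildcard, mirroring how a TLS wildcard does not cover the apex name; a real protocol would have to pick one behavior explicitly.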

lynn · January 18, 2020, 4:12pm

I don’t know about nMobile, but I decided to add admin user as always having permission, for UX reasons.

When you join your permissioned channel where you’re admin, you still obviously have to subscribe to said channel, and since you wouldn’t have permissions to post there before you add yourself to the accept list, you have to subscribe again with the __permissions__ identifier. This probably happens quite quickly, so while the original subscription is still in mempool, creating the __permissions__ subscription shoves that one aside, and now you’re effectively unsubscribed.

Side note:
Does nMobile use [{addr: 'xyz'}] or ['xyz'] for the lists?

Here is an initial version https://gitlab.com/losnappas/nkn-permissioned-pubsub using the ['asdf', 'xyz'] style, but changing is easy enough.

yilun · January 19, 2020, 4:56am

I think your suggested [{addr: 'xyz'}] is more flexible, so we can use that. The scheme in nMobile is free to change, so it’s not a problem.

yilun · January 19, 2020, 4:59am

Also, what about using {"addr": "addr1"} to represent a specific address, and {"pubkey": "pk1"} to represent all addresses using this pubkey?

lynn · January 19, 2020, 2:08pm

Maybe {"addr": "pubkey", "pubkey": true} instead? That way it’ll be easier to compare between “addr” attributes only.

Or maybe it’s better to make it like this:

If we want to use the pubkey version of the rules, then {addr: 'pubkey'} and if not, but user has no identifier, we use {addr: '.pubkey'}? The dot being the difference.

yilun · January 19, 2020, 8:28pm

I’m a little bit concerned about the additional overhead. Each metadata is restricted to ~1k bytes, and ,"pubkey":true is using up 14 of them per subscriber. I think we should try to make per-subscriber overhead as small as possible and only add necessary fields.

lynn · January 20, 2020, 3:08pm

We made them objects so we could add fields like that.

Well, if you think that that’s going to run into space trouble, then how about we modify this up a bit.

{
  accept: {
   addr: ['addr1',...],
   pubkey: ['pk1', ...],
  },
  reject: {
   // as above
  }
}

Though it isn’t as lenient as the objects version, it should take less space, no?

Speaking of space, does making the fields like {a: {a:[],p:[]}, r: {...} }, where the keys are 1 char make sense, or is that not worth it?
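
To get a feel for the space question, here is a rough comparison of the encodings discussed so far. The ten public keys are made-up 64-character hex strings, and sizes are measured on the JSON text only.

```javascript
const byteLength = (s) => new TextEncoder().encode(s).length;
const size = (value) => byteLength(JSON.stringify(value));

// Ten made-up 64-character hex public keys, for illustration only.
const pks = Array.from({ length: 10 }, (_, i) => i.toString(16).padStart(64, '0'));

const asObjects = { accept: pks.map((pk) => ({ pubkey: pk })) };
const withFlag = { accept: pks.map((pk) => ({ addr: pk, pubkey: true })) };
const grouped = { accept: { pubkey: pks } };
const shortKeys = { a: { p: pks } };

for (const [name, value] of Object.entries({ asObjects, withFlag, grouped, shortKeys })) {
  console.log(name, size(value));
}
```

In this sketch the grouped form pays for its keys once per metadata rather than once per subscriber, so shortening the keys to one character only saves a fixed handful of bytes on top of it.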

lynn · January 20, 2020, 7:39pm

Going back to this, it still doesn’t solve everything that we approve the admin automatically, because admin still needs to be subscribed in order to receive messages.

If a usual user goes along the lines of

join your own, new channel
your friend joins
you add permission to your friend
- you just overwrote your own subscription tx

It’s kind of a pain to deal with this. Maybe we add the permissions into the original identifier, instead of the __n__.__permissions__ identifier.
If you then change your identifier, you’re essentially back where we started, which is subbing as 2 identifiers, but that’s quite an edge case.

So instead of __0__.__permissions__, use any identifier, and metadata like:

{
 permissions: {
  accept: {...allows},
  ban: {...bans},
 },
 ...unrelated_metadata,
}

yilun · January 21, 2020, 2:23am

We made them objects so we could add fields like that.

I agree. Actually I agree that we should make them objects so that we can add necessary fields later.

Actually IMO, {accept: [{addr: xxx}, {pubkey: xxx}]} should be almost identical to {accept: [{addr: xxx, pubkey: false}, {addr: xxx, pubkey: true}]} in terms of code complexity:

// permission is {addr: xxx} or {pubkey: xxx}
// addrToPubkey is assumed to extract the public key from an address
function checkAccept(addr, perm) {
  if (perm.pubkey) {
    // handle wildcard
    return addrToPubkey(addr) === perm.pubkey;
  } else if (perm.addr) {
    // handle wildcard
    return addr === perm.addr;
  }
  return false;
}

// alternative: permission is {addr: xxx, pubkey: false} or {addr: xxx, pubkey: true}
function checkAccept(addr, perm) {
  if (perm.pubkey) {
    // handle wildcard
    return addrToPubkey(addr) === perm.addr;
  } else if (perm.addr) {
    // handle wildcard
    return addr === perm.addr;
  }
  return false;
}

you just overwrote your own subscription tx

This should not usually happen because the SDK will try to get the nonce from txpool first. So if you send 2 txns without waiting, they will not override each other, unless they are subscribing to the same topic with the same identifier. So separating the permission identifier from the admin’s own identifier has one advantage: he can subscribe and change permissions at the same time without one affecting the other.
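
A toy model of why the nonce source matters. This is not the real NKN txpool: it only assumes that a sender can have at most one pending txn per nonce, so a second txn that reuses a nonce replaces the first. ToyPool and its methods are hypothetical.

```javascript
// Toy model, not the real NKN txpool: at most one pending tx per nonce
// for a single sender.
class ToyPool {
  constructor() {
    this.pending = new Map(); // nonce -> tx
    this.chainNonce = 0; // next nonce according to confirmed blocks
  }

  // With useTxPool the next nonce also accounts for pending txns.
  getNonce(useTxPool) {
    if (!useTxPool || this.pending.size === 0) return this.chainNonce;
    return Math.max(...this.pending.keys()) + 1;
  }

  submit(tx, useTxPool) {
    const nonce = this.getNonce(useTxPool);
    this.pending.set(nonce, tx); // same nonce replaces the earlier tx
    return nonce;
  }
}

const withoutPool = new ToyPool();
withoutPool.submit('subscribe identifier1', false);
withoutPool.submit('subscribe identifier2', false);
console.log(withoutPool.pending.size); // one pending tx: the second replaced the first

const withPool = new ToyPool();
withPool.submit('subscribe identifier1', true);
withPool.submit('subscribe identifier2', true);
console.log(withPool.pending.size); // two pending txns with different nonces
```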

Maybe we add the permissions into the original identifier, instead of the __n__.__permissions__ identifier.

Most importantly, __n__.__permissions__ is needed because of the meta size limit. Since each metadata needs to be <= 1024 bytes, each meta can only store up to 14 subscribers if we use hex representation. We can use more efficient representation (e.g. wallet address, short hash, or even bloom filter), but ~50 subscribers are probably the limit.

I think we can use a special identifier to store some group metadata, like what’s the range of n in __n__.__permissions__. IMO using a fixed special identifier is better than using owner’s identifier because:

owner’s identifier is not unique, so there might be a conflict. And if there is a priority, user needs to get all subscribers in order to find the effective one.
other people don’t know the owner’s identifier in advance, so it’s possible to attack such a group by sending a massive number of subscribe txns, making it very inefficient to find the real owner. If the group is very important, then an attacker definitely has a motivation to spend some txn fees and attack the group.

So my suggestion is that:

We still separate the permission identifier and the owner’s identifier, and the owner needs to send 2 txns. Because they use different identifiers, there won’t be a conflict or override.
We choose a fixed identifier to store some group metadata, like the range of n.
For the fixed identifier, we can probably use __0__.__metadata__ or something else. The __0__ is added in case we want to make it multiple (e.g. due to meta size limit) in the future.

lynn · January 21, 2020, 7:33pm

I’ve been doing this, but the second tx always overrides the first if both transactions are to exist in the mempool at the same time.

A test:

https://gitlab.com/losnappas/nkn-permissioned-pubsub/blob/master/test/test2.js

/**
 * Tests having two subscriptions from one wallet in mempool.
 * Quite unrelated to the lib.
 */
import nknWallet from 'nkn-wallet'
import Nkn from 'nkn-multiclient'
import test from 'tape'

const delay = 10000
const client = new Nkn({
	seed: '2bc5501d131696429264eb7286c44a29dd44dd66834d9471bd8b0eb875a1edb0',
	seedRpcServerAddr: 'http://devnet-seed-0001.nkn.org:30003'
})
nknWallet.configure({
	rpcAddr: 'http://devnet-seed-0001.nkn.org:30003'
})
const wallet = nknWallet.restoreWalletBySeed(
	'2bc5501d131696429264eb7286c44a29dd44dd66834d9471bd8b0eb875a1edb0',
	'x'
)

This file has been truncated.

lynn · January 21, 2020, 8:02pm

I don’t see how this makes sense. getSubscription doesn’t return from mempool, which makes things slow, but getSubscribers does. Since I’ve been using the latter to get permissions, you have to parse all the subscribers and find the __${n}__.__permission__ identifiers anyway.

Plus, there’s 6k limit to subs per topic as far as I know, so capping that is far easier than trying to attack the group by adding a couple of subs to parse.

Do you think that we should use getSubscription instead anyways? It is far too slow, in my opinion.

yilun · January 21, 2020, 10:44pm

I’ve been doing this, but the second tx always overrides the first if both transactions are to exist in the mempool at the same time.

Oh you are right! The nonce in txpool is not used by default in the JS SDK getNonce function (but it is used by default in the Go SDK). We just published a new version of nkn-wallet-js (0.4.9) that uses the nonce in txpool by default. The following simple script sends two subscribe txns with different nonces:

(async function () {
  console.log(await wallet.subscribe('topic', 10, 'identifier1'));
  console.log(await wallet.subscribe('topic', 10, 'identifier2'));
})()

Resulting blocks can be found at https://explorer.nknx.org/blocks/790396

yilun · January 21, 2020, 11:16pm

there’s 6k limit to subs per topic as far as I know

It’s actually 100k defined at https://github.com/nknorg/nkn/blob/master/util/config/config.go#L114

getSubscription doesn’t return from mempool, which makes things slow

mempool does not affect speed. It only affects whether to return additional results that have not been packed into a block yet. For results that are already in a block, mempool will not have any effect.

Also for permissions, not including results in mempool sounds reasonable because it’s safer (they have passed consensus).

Since I’ve been using the latter to get permissions, you have to parse all the subscribers and find the __${n}__.__permission__ identifiers anyway.

Let’s assume there are N subscribers in the topic, and M addresses added by the group owner in the permission list. Typically N is very close to M.

The complexity (time, space, io, etc) to get permission by scanning all subscribers is O(N), while if we store the range of n in the meta of a special identifier it becomes O(M).
The complexity to get all recipients is O(N) if we use getSubscribers, and is O(M) with a larger constant if we use getSubscription.
Typically it’s probably better to use getSubscribers if one wants to send msg to all permissioned subscribers because it’s faster, but that’s not always true. A few counterexamples:

In some cases like Chat-based customer support built on nPubSub we just need to get a random permitted subscriber from a random permission bucket. In such case storing n in meta leads to O(1) complexity.
In the case where M is small and attacker increases N rapidly (e.g. an important small private group), we can somehow switch to getSubscription instead such that the total complexity is O(M) and will not be affected by attackers.

What I was suggesting is that, we store n in some special identifier as part of the protocol. In your implementation probably you can still use getSubscribers now, but it leaves space to use getSubscription once we need it without changing the protocol.
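
The two strategies can be compared with a small in-memory stand-in for one topic's subscriptions. Nothing here calls a real node: makeFakeChain only counts how many records each strategy touches, and the "buckets" field name in the group metadata is hypothetical.

```javascript
// In-memory stand-in for one topic's on-chain subscriptions (not a real client).
function makeFakeChain(records) {
  let reads = 0;
  return {
    getSubscribers() {
      reads += Object.keys(records).length; // touches every subscriber: O(N)
      return { ...records };
    },
    getSubscription(subscriber) {
      reads += 1; // touches a single record
      return records[subscriber];
    },
    reads: () => reads,
  };
}

// Strategy 1: scan all subscribers and keep the owner's permission buckets.
function bucketsByScan(chain, ownerPubkey) {
  const pattern = /^__\d+__\.__permission__\.(.+)$/;
  return Object.entries(chain.getSubscribers())
    .filter(([addr]) => (pattern.exec(addr) || [])[1] === ownerPubkey)
    .map(([, meta]) => meta);
}

// Strategy 2: read the bucket count from a fixed identifier, then fetch
// each bucket directly. "buckets" is a hypothetical field name.
function bucketsByLookup(chain, ownerPubkey) {
  const info = JSON.parse(chain.getSubscription(`__0__.__metadata__.${ownerPubkey}`));
  const metas = [];
  for (let n = 0; n < info.buckets; n++) {
    metas.push(chain.getSubscription(`__${n}__.__permission__.${ownerPubkey}`));
  }
  return metas;
}

const owner = 'ownerpk';
const records = {
  [`__0__.__metadata__.${owner}`]: '{"buckets":2}',
  [`__0__.__permission__.${owner}`]: '{"accept":[{"addr":"alice.pk1"}]}',
  [`__1__.__permission__.${owner}`]: '{"accept":[{"pubkey":"pk2"}]}',
};
// 1000 unrelated subscribers, e.g. added by an attacker.
for (let i = 0; i < 1000; i++) records[`spam${i}.pk${i}`] = '';

const scanChain = makeFakeChain(records);
const lookupChain = makeFakeChain(records);
console.log(bucketsByScan(scanChain, owner).length, scanChain.reads());
console.log(bucketsByLookup(lookupChain, owner).length, lookupChain.reads());
```

Both strategies return the same buckets; the lookup variant touches one record per bucket plus the metadata record, regardless of how many other subscribers the topic has.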

lynn · January 22, 2020, 7:55pm

After updating, there’s a new kind of issue: you cannot overwrite your tx in the pool at all, because it throws “duplicate tx in block”.

So now we cannot update the settings while they’re still in mempool, which makes it impossible to add new settings before old ones have resolved.

Can we add an option to do that?

edit: never mind, that option already exists