[NKP-0017] Cached message timelining

The goal of this NKP is to improve the way nodes relay their cached messages when they detect a client coming back online.

Background: in D-Chat, after being offline overnight, you might have noticed that when you come back online, the chat history from that night arrives quite scrambled.
I think it is because the cached messages

  1. are relayed in the wrong order, or
  2. are sent too quickly in succession, not giving the client machine enough time to do its per-message computations.

These could be overcome in d-chat itself, I suppose, but the system is probably better off if every future application doesn’t have to keep track of message timelines by itself.

The proposal is to improve the cached message relay order by keeping track of the message timeline in the nodes, and by not sending all the messages within 0.01 seconds.

As it is, some nodes seem to keep track of the ordering quite well, and others not at all.


Messages are indeed cached and delivered in order (see https://github.com/nknorg/nkn/blob/master/api/websocket/messagebuffer/messagebuffer.go#L24 ), but there is no interval added between them; each message is sent as fast as the websocket connection allows.
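For illustration, the pacing the proposal asks for could be a small delay between buffered sends. This is a hedged sketch, not the actual node code: `Message`, `relayBuffered`, and the `sendInterval` knob are all made-up names, and the real relay loop lives in the node's websocket server.

```go
package main

import (
	"fmt"
	"time"
)

// Message is a stand-in for a buffered message; the real node keeps
// these in api/websocket/messagebuffer/messagebuffer.go.
type Message struct {
	Payload []byte
}

// relayBuffered sends the cached messages in order, sleeping a short
// interval between sends so the client has time to process one message
// before the next arrives. sendInterval is a hypothetical knob, not an
// existing node option.
func relayBuffered(msgs []*Message, send func([]byte) error, sendInterval time.Duration) error {
	for i, m := range msgs {
		if err := send(m.Payload); err != nil {
			return err
		}
		if i < len(msgs)-1 {
			time.Sleep(sendInterval)
		}
	}
	return nil
}

func main() {
	msgs := []*Message{{[]byte("m1")}, {[]byte("m2")}, {[]byte("m3")}}
	_ = relayBuffered(msgs, func(b []byte) error {
		fmt.Println(string(b))
		return nil
	}, 10*time.Millisecond)
}
```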

Since TCP itself is ordered, I’m a bit surprised that the client does not receive cached messages in order. Do you know if this happens quite a lot or just occasionally?

> Do you know if this happens quite a lot or just occasionally?

It happens a lot when there are many messages, less often when there are just a few. Perhaps this could be fixed in client-js by adding some buffer time.

I have a theory:

Because we use multi-client, there are actually multiple sub-clients receiving messages, each connected to a different node.

Now assume someone sent a series of messages labeled m1, m2, m3, m4 to the client while it was offline, and let’s consider only two nodes connected to two different sub-clients. When the first two messages arrive, the node buffers look like:

node A: m1, m2
node B: m1, m2

But now node A goes offline, or another node closer to the client takes over node A’s position. In either case, a new node A’ becomes the node connected to the client, and the buffers become:

node A’:
node B: m1, m2

Then the next two messages arrive, and the buffers become:

node A’: m3, m4
node B: m1, m2, m3, m4

Now when the client comes online, it will receive messages concurrently from both node A’ and node B, and m3 might be the first or second message it receives.

To handle all the possible scenarios caused by nodes joining and leaving, the most robust approach is probably on the application side…
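One application-side piece of that is deduplication: in the scenario above, node A’ and node B both relay m3 and m4, so the client will see them twice. A minimal sketch, assuming each message carries a unique ID the client can key on (the `Deduper` type and its string IDs are illustrative, not an existing client API):

```go
package main

import "fmt"

// Deduper drops messages whose ID has already been seen, handling the
// case where two nodes both relay the same buffered message.
type Deduper struct {
	seen map[string]bool
}

func NewDeduper() *Deduper {
	return &Deduper{seen: make(map[string]bool)}
}

// Accept returns true the first time a given message ID is seen.
// A real client would use the message's actual ID and eventually
// evict old entries; a plain string stands in here.
func (d *Deduper) Accept(id string) bool {
	if d.seen[id] {
		return false
	}
	d.seen[id] = true
	return true
}

func main() {
	d := NewDeduper()
	// Interleaved arrival from node B (m1..m4) and node A' (m3, m4).
	for _, id := range []string{"m1", "m2", "m3", "m3", "m4", "m4"} {
		if d.Accept(id) {
			fmt.Println("deliver", id)
		}
	}
	// Each of m1..m4 is delivered exactly once.
}
```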


That all makes sense.

Extending multi-client with some logic at websocket connect should be easy enough; I will look into it at some point.
At startup, I’ll use message timestamps to do some simple ordering before sending them to the event listeners.
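That startup ordering could look something like the following sketch (in Go rather than the client’s JavaScript, and assuming the cached messages carry a sender timestamp; `Cached` and `orderAtStartup` are illustrative names):

```go
package main

import (
	"fmt"
	"sort"
)

// Cached is a stand-in for a message received during the initial flush
// of node buffers; Timestamp is assumed to be set by the sender.
type Cached struct {
	ID        string
	Timestamp int64 // e.g. unix milliseconds
}

// orderAtStartup sorts the buffered backlog by sender timestamp before
// handing it to the event listeners. The sort is stable, so messages
// with equal timestamps keep their arrival order.
func orderAtStartup(backlog []Cached) []Cached {
	sort.SliceStable(backlog, func(i, j int) bool {
		return backlog[i].Timestamp < backlog[j].Timestamp
	})
	return backlog
}

func main() {
	// Out-of-order arrival as in the two-node scenario above.
	backlog := []Cached{{"m3", 3}, {"m1", 1}, {"m4", 4}, {"m2", 2}}
	for _, m := range orderAtStartup(backlog) {
		fmt.Println(m.ID)
	}
	// Prints m1, m2, m3, m4 in order.
}
```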

Better ideas?

> At startup, I’ll use message timestamps to do some simple ordering before sending them to the event listeners.

That sounds like a good idea to me.