[NKP-0021] Prefer stable nodes in routing to incentivize nodes to be more stable

There are a few points that we might have a disagreement on.

  1. If routing is probabilistic, it will cause much larger jitter than you might expect, possibly on the order of 100 ms. This is because when packets are sent to different nodes, their neighbor sets and next hops will be different, and the paths will diverge. In case you don’t know, routing is based on DHT space for security reasons, so different paths can have very different latency, causing significant jitter. Similarly, probabilistic routing will turn packet delivery from mostly ordered into mostly unordered.

  2. Jitter and packet ordering are more important than you might think. In general the NKN network is used in two ways. One is using raw client/multiclient packets, similar to UDP packets (e.g. d-chat, nMobile, etc.), in which case rough ordering is much friendlier to the application side than unordered delivery. As an example, let’s say we want to build a realtime voice communication app. If packets are mostly ordered, a packet can be added to the play buffer the moment it arrives. But if they are mostly unordered, about 2 * jitter or more of extra latency needs to be added before a packet can go into the buffer (see the sketch after this list). The other way of using NKN is the session mode (think of it as TCP), in which case packet ordering is even more important, because a session needs to deliver packets in strict order. In this case, better packet ordering will significantly reduce session latency, increase throughput and reduce client side resource consumption.

  3. Having a cutoff will not make the network fragile. If the weight function is continuous (this is important), with a cutoff at the end, near the cutoff point it just degenerates to the current behavior where uptime is not a factor. And I don’t think the current network is fragile.
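
As an illustration of the playout buffer point in item 2 (a rough sketch with made-up latency numbers, not NKN code): with mostly-ordered delivery a packet can be played the moment it arrives, while with unordered delivery playback has to be delayed by roughly the spread between the fastest and slowest paths, i.e. about 2 * jitter or more.

```go
package main

import (
	"fmt"
	"math/rand"
)

func main() {
	rng := rand.New(rand.NewSource(42))

	// Hypothetical one-way latencies: 80 ms base, +/- 50 ms of jitter caused
	// by packets diverging onto different paths.
	minLat, maxLat := 1e9, 0.0
	for i := 0; i < 1000; i++ {
		lat := 80 + (rng.Float64()*2-1)*50
		if lat < minLat {
			minLat = lat
		}
		if lat > maxLat {
			maxLat = lat
		}
	}

	// To play packets back in order without gaps, the playout delay has to
	// cover the spread between the fastest and slowest packets (~2 * jitter).
	fmt.Printf("required extra playout delay ≈ %.0f ms\n", maxLat-minLat)
}
```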

Besides, if you take a weighted approach, the node count will grow faster because new node operators won’t be stymied by a lack of rewards for such a long period of time. (This is the #1 problem of NKN because it’s directly tied to marketcap growth.)

This is definitely not right. Node count will always stabilize at the point where mining is just barely profitable. To be more precise:

node count = total mining reward in NKN per month * NKN price / average node cost per month

Total mining reward in NKN per month is roughly constant (it only changes slowly over time), and average node cost per month is also roughly constant, so the node count depends only on the token price. Choosing a different weight function will only affect who ends up mining, but won’t change the network size.
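
To make the equilibrium argument concrete, here is a minimal sketch of the formula above; all numbers are hypothetical placeholders, not actual NKN figures.

```go
package main

import "fmt"

func main() {
	monthlyRewardNKN := 1_000_000.0 // hypothetical total mining reward per month, in NKN
	nknPriceUSD := 0.10             // hypothetical token price, in USD
	nodeCostUSD := 5.0              // hypothetical average node cost per month, in USD

	// node count = total mining reward in NKN per month * NKN price / average node cost per month
	nodeCount := monthlyRewardNKN * nknPriceUSD / nodeCostUSD
	fmt.Printf("equilibrium node count ≈ %.0f\n", nodeCount)

	// Doubling the token price doubles the equilibrium node count; the weight
	// function does not appear in the formula at all.
	fmt.Printf("at 2x price ≈ %.0f\n", monthlyRewardNKN*2*nknPriceUSD/nodeCostUSD)
}
```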

Also, why not send packets down duplicate routes? Packets cost like picodollars to deliver. Who cares if the cost of doing that more reliably is double the nonredundant cost? This would be antifragile and performant.

In the long term, bandwidth cost will be the major cost of a node. Actually, it is already the major cost on some platforms. Sending each packet along 2 different routes will double the bandwidth cost of every node in the network, which will reduce the network size by about half once bandwidth is the major cost.

Performance is important but ecosystem growth is king. Billions have been made by companies producing junk software far worse than anything NKN has to offer, simply because of adoption and standardization.

I definitely agree. But when we talk about adoption and standardization, we are talking about the developer and application side that uses NKN. For them, network performance, developer friendliness, etc. are the relevant factors, which is why we should always make NKN easier to use and better performing.

I understand your points about latency vs. ordering entropy. I never imagined that the latency could ever be so horrible without strongly ordered nonbranching routes. Given that, I would rather have high bandwidth and high latency than low bandwidth and low latency. The former is what you get when you optimize for ecosystem buildout; the latter is what you get when you pursue realtime apps like video conferencing. If it were up to me, NKN would not pursue realtime service delivery because the bandwidth requirements are fairly constant for the vast majority of users, so existing ISPs have largely already optimized for that. (100 Mbps is a basic install in most major cities now, which will suffice for most remote worker needs.) On the other hand, there are some extremely bandwidth-intensive computational applications that can eat terabits per second, but they don’t need it instantly. ISPs suck at delivering that economically. NKN could excel there and carve out a huge niche for its growth.

Fragility arises from changepoints in network dynamics. If the derivative of the reward with respect to uptime is discontinuous, then you could get a scenario where many machines go offline with the same periodicity, resulting in undesirable resonances (even if they’re mutually phase-shifted) and all the resulting fragilities. (This sort of issue comes up all the time in phase-locked loop design in electronics, and you don’t even have malicious actors there.) I’m just saying: analytic functions are safer than merely continuous functions. It’s up to you, though.

The argument you made for threshold profitability determining node count does indeed make sense. So then, if you optimize first and foremost for node count growth, then by implication you’re also optimizing for token price growth which would encourage more entrepreneurs to set up nodes. In other words, if you can present a credible and highly competitive roadmap, the prospects of rapid token price growth will attract more nodes, even if they’re temporarily unprofitable. It’s the same reason why landlords in cities with rapid growth prospects are often willing to lose money on the rent for a while. “We’re going to compete with commodity ISPs and mobile carriers on realtime communication” is not a credible roadmap in my view, however much I might wish it to be. “We’re going to provide the bandwidth that will enable Web3 services to operate with maximum cost-effectiveness” is much more believable and motivating.

My assessment is this: You people are really smart engineers with an excellent understanding of the relevant aspects of network behavior and consensus. You understand the limits and capabilities of your nodes really well. You’ve written flow control software that seems secure and reliable. But you’re optimizing for connection quality when you should be optimizing for ecosystem growth. If you can pull off both, I’ll be happy to look like a fool, but I think that’s exceedingly difficult.

Fragility arises from changepoints in network dynamics. If the derivative of the reward with respect to uptime is discontinuous, then you could get a scenario where many machines go offline with the same periodicity, resulting in undesirable resonances (even if they’re mutually phase-shifted) and all the resulting fragilities. (This sort of issue comes up all the time in phase-locked loop design in electronics, and you don’t even have malicious actors there.) I’m just saying: analytic functions are safer than merely continuous functions. It’s up to you, though.

If the first derivative is a potential cause of the trouble, what we can do is choose a function that has zero derivative at the cutoff point. This way the curve will have a continuous first derivative and a discontinuous second derivative (at the cutoff point). Second and higher derivatives don’t seem meaningful to miners, so that shouldn’t be a problem.
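
As an illustration (a hypothetical curve, not the one NKN ended up using, which appears later in this thread): a quadratic ramp spliced with a constant past the cutoff has zero slope at the cutoff, so the first derivative is continuous there while the second derivative jumps.

```go
package main

import "fmt"

// weight is a hypothetical uptime weight: w(x) = 1 - (1-x)^2 for x in [0,1]
// and w(x) = 1 past the cutoff. w'(x) = 2*(1-x), so w'(1) = 0 matches the flat
// part; w''(x) jumps from -2 to 0 at the cutoff x = 1.
func weight(x float64) float64 {
	if x >= 1 { // past the cutoff, uptime is no longer a factor
		return 1
	}
	return 1 - (1-x)*(1-x)
}

func main() {
	for _, x := range []float64{0, 0.25, 0.5, 0.75, 1, 1.5} {
		fmt.Printf("x=%.2f  w=%.3f\n", x, weight(x))
	}
}
```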

Given that, I would rather have high bandwidth and high latency than low bandwidth and low latency.

Actually, with a limited send/receive window size, low latency will lead to high bandwidth, and high latency will lead to low bandwidth. This is true not only for the session mode in NKN, but also for TCP or any other protocol that needs strict ordering. That’s why I’m not a fan of probabilistic routing – the jitter it introduces will increase effective latency and thus reduce throughput, making both metrics worse.

In other words, if you can present a credible and highly competitive roadmap, the prospects of rapid token price growth will attract more nodes, even if they’re temporarily unprofitable. It’s the same reason why landlords in cities with rapid growth prospects are often willing to lose money on the rent for a while.

I’m not sure if this is true for crypto mining. If I’m temporarily losing money on mining but I still expect the token price to grow in the future, the better move for me is to pause mining for now and just buy tokens from the market. This way I get more tokens for the same cost. The difference between mining and being a landlord is that if you don’t start building now, you might not have the opportunity (or will face higher costs) in the future. But for NKN mining, where there is no specialized hardware, there is no difference between joining now and joining in the future.

But you’re optimizing for connection quality when you should be optimizing for ecosystem growth.

IMO they are the same thing, or at least on the same track. There are two ecosystems: the mining ecosystem and the user ecosystem (devs who use NKN). The user ecosystem is the one that needs more improvement now, and it’s the one that eventually determines the utilization, adoption and the upper limit of NKN in the long run. Optimizing connection quality is definitely helpful in optimizing the user ecosystem.

I agree with your proposed solution here.

I think what you’re implying is that the act of sorting packets more than cancels the advantages of having many parallel routes. That seems improbable unless we assume some bad architecture in which packets share hardware pages or the stack is written in Python, etc. Are you really sure about this? I would expect a sweet spot where the buffer is large enough to accommodate many contributing senders, but small enough to make the (N log N) packet sorting (ordering recovery) acceptably fast. Unless there’s some other source of latency I’m ignoring.

I understand your point that miners should only be willing to work when the price of NKN implies that they’ll be sufficiently profitable. In practice, I think this is only true from quarter to quarter or year to year, which are long periods of time in the crypto world. The reason is that it takes a long time to get set up and learn what works and what doesn’t, even apart from issues such as uptime score. Even though it might be smarter for me to turn off my miner and go buy NKN, I probably won’t do that because it takes a very long time to really understand how much income I stand to make. So I need to leave the miner running for months, at a minimum. Now, if I can see a promising roadmap, I’m more likely to tolerate unprofitability in the near term, knowing that if the token price explodes later, I’ll already be optimized to profit.

If you’re really convinced that optimizing for low latency and high bandwidth is equivalent to optimizing for ecosystem growth, I won’t argue with you. In theory, that would be correct. I’m just wary because I’ve seen garbage software companies with lots of useful but buggy features destroy brilliant software companies with fewer but more performant features, again and again, throughout the history of the industry. It’s not a law of nature, however, so perhaps my concerns are unfounded in this case. I hope so.

EDIT: Perhaps there’s a way to have the best of both worlds. What about adding a bit somewhere that says “optimize for realtime streaming” or “optimize for best efforts file transfer”. In the second case, your packets are all XORs of random subsets of the original packets, so you can reconstruct them at the receiving end with the help of a small percentage of redundant packets. Then all you need to do is sort by original file ordering, and you’re done. Security is enforced by digital signatures, although checking for random errors can be done by simpler means. (I realize that the nodes operate on packets, not files, so there would need to be some interaction between these different levels of the stack.) Doing something like this would allow your miners to also profit from extremely cost-sensitive data that can tolerate high latency.

I think what you’re implying is that the act of sorting packets more than cancels the advantages of having many parallel routes. That seems improbable unless we assume some bad architecture in which packets share hardware pages or the stack is written in Python, etc. Are you really sure about this?

It has nothing to do with the implementation; it’s just a theoretical limit on throughput caused by the finite send/receive window and latency (because reliable, ordered transmission requires ACKs). To be more precise: throughput <= window size / RTT.
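
A quick numeric sketch of this bound (window size and RTTs are illustrative only): throughput is capped by window / RTT, and conversely sustaining a target throughput at a given RTT requires a window of at least throughput * RTT.

```go
package main

import "fmt"

func main() {
	window := 256.0 * 1024 // send/receive window in bytes (illustrative)
	for _, rttMs := range []float64{20, 100, 300} {
		rtt := rttMs / 1000.0
		maxThroughput := window / rtt // bytes per second upper bound
		fmt.Printf("RTT=%3.0f ms  max throughput ≈ %6.2f Mbit/s\n", rttMs, maxThroughput*8/1e6)
	}
	// The same relation read backwards gives the per-connection RAM cost
	// mentioned below: window >= target throughput * RTT.
}
```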

I would expect a sweet spot where the buffer is large enough to accommodate many contributing senders, but small enough to make the (N log N) packet sorting (ordering recovery) acceptably fast. Unless there’s some other source of latency I’m ignoring.

A large buffer (send/receive window) can indeed increase throughput under high latency, but it will also increase RAM usage on the client side proportionally. In practice this is a pretty common bottleneck on the client side, especially on mobile devices. We struggled with that when developing nConnect for iOS, where we had to reduce the buffer size to fit into the system RAM limit for the network extension.

If you’re really convinced that optimizing for low latency and high bandwidth is equivalent to optimizing for ecosystem growth, I won’t argue with you. In theory, that would be correct. I’m just wary because I’ve seen garbage software companies with lots of useful but buggy features destroy brilliant software companies with fewer but more performant features, again and again, throughout the history of the industry. It’s not a law of nature, however, so perhaps my concerns are unfounded in this case. I hope so.

I’m not saying optimizing performance is equivalent to optimizing the ecosystem, I just meant it’s beneficial for (developer and user) ecosystem growth. So ideally we don’t want to sacrifice it.

EDIT: Perhaps there’s a way to have the best of both worlds. What about adding a bit somewhere that says “optimize for realtime streaming” or “optimize for best efforts file transfer”. In the second case, your packets are all XORs of random subsets of the original packets, so you can reconstruct them at the receiving end with the help of a small percentage of redundant packets. Then all you need to do is sort by original file ordering, and you’re done. Security is enforced by digital signatures, although checking for random errors can be done by simpler means. (I realize that the nodes operate on packets, not files, so there would need to be some interaction between these different levels of the stack.) Doing something like this would allow your miners to also profit from extremely cost-sensitive data that can tolerate high latency.

That’s exactly what we are doing now. We use redundant multi-path packets in multiclients for lower latency and higher reliability, optimized for low throughput applications like messaging. We use non-redundant multi-path packet transmission with our NCP protocol for reliable, ordered transmission without wasting bandwidth, optimized for high throughput applications or applications that require reliable data transmission. Back to the topic of this post: these protocols help us reach the theoretical limits of end-to-end performance, while a better routing algorithm (like deterministic routing) helps raise those theoretical limits. We try to optimize both as much as possible.

I think I understand a bit better now what you’re doing.

But I’m confused as to why you’d want nonredundant transmission in cases where the receiver requires high bandwidth. Why not include some extra packets and attempt to learn the receiver’s bandwidth over time by measuring the ACK interval? (I mean: send optimistically after you think you’ve learned the receiver’s bandwidth, rather than waiting for a roundtrip again and again.) Yes, this might be a bit complicated due to routers with buffer bloat in the middle, but it seems like it should eventually settle into a steady state for huge file transmissions. I can think of a lot of backup processes which need massive bandwidth but literally don’t care if the latency is a minute. They just want cheap and fast, and they tend to have predictable bandwidth because they’re sitting on fixed broadband links.

In any event, optimization of your protocol is probably a game of diminishing returns (not that you’ve hit the wall yet). If that’s the case then you would have to find other ways to be more competitive. Faster internet connections help your competitors as well, so you can’t rely on that (or Moore’s Law, for that matter) for competitive advantage. Perhaps this yearlong thread has given you some ideas about your optimization assumptions. But if you end up with the same conclusions, then you’re probably approaching your efficiency limits, at least in software.

But I’m confused as to why you’d want nonredundant transmission in cases where the receiver requires high bandwidth. Why not include some extra packets and attempt to learn the receiver’s bandwidth over time by measuring the ACK interval? (I mean: send optimistically after you think you’ve learned the receiver’s bandwidth, rather than waiting for a roundtrip again and again.) Yes, this might be a bit complicated due to routers with buffer bloat in the middle, but it seems like it should eventually settle into a steady state for huge file transmissions. I can think of a lot of backup processes which need massive bandwidth but literally don’t care if the latency is a minute. They just want cheap and fast, and they tend to have predictable bandwidth because they’re sitting on fixed broadband links.

That’s what every ARQ protocol (like TCP or our NCP) is trying to do, but it doesn’t require any redundant packets. Actually, when packets are flowing continuously at max throughput, redundant packets will reduce the throughput to 1/(1 + redundancy_ratio) of the original in the worst case (e.g. when the sender’s uplink or the receiver’s downlink is the bottleneck connection).
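
A tiny worked example of that worst case (hypothetical link capacity): when the bottleneck is the sender’s uplink or receiver’s downlink, fixed redundancy scales goodput by 1/(1 + redundancy_ratio).

```go
package main

import "fmt"

func main() {
	linkMbps := 100.0 // hypothetical bottleneck link capacity
	for _, redundancy := range []float64{0, 0.1, 0.25, 0.5, 1} {
		goodput := linkMbps / (1 + redundancy) // useful throughput after redundancy overhead
		fmt.Printf("redundancy %3.0f%%  goodput ≈ %5.1f Mbit/s\n", redundancy*100, goodput)
	}
	// With 100% redundancy (every packet duplicated), goodput halves, matching
	// the "double the bandwidth cost" point earlier in the thread.
}
```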

But isn’t that formula only valid if you assume no lost packets? If you lose some small fraction of packets, then redundant packets can often repair the entire loss. If you have no redundancy, then you need to pay for a roundtrip just to request them again. That can be far more expensive than the redundancy.

It is an upper bound of the throughput, and the bound can only be achieved when there is no packet loss, jitter, etc.

When packet loss or timeouts are non-negligible, instead of sending redundant packets it’s better to use an error correction code (e.g. FEC) for optimal bandwidth usage. However, for NKN client sessions using NCP, the packet loss rate is really low because most nodes choose to use TCP as their transport layer, so the benefit of redundancy is negligible.

To be clear, when I said “redundant”, I meant a distributed ECC, as in, all packets are actually XORs of random subsets of the original packet set. (You could also use linear combinations with some caveats.) Then linear algebra can extract all the original packets. If you lose a small number of packets, then you only need to send roughly the same number of extra packets, unless the subsets are sparse, in which case you need to send a lot more.
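
For concreteness, here is a minimal sketch (tiny made-up sizes, not NKN’s actual protocol) of that idea: each coded packet is the XOR of a random subset of the originals, the subset is carried as a bitmask, and the receiver recovers the originals by Gaussian elimination over GF(2).

```go
package main

import (
	"fmt"
	"math/rand"
)

const n = 4         // number of original packets (tiny, for illustration)
const packetLen = 8 // bytes per packet

// coded is one transmitted packet: the XOR of the originals selected by mask.
type coded struct {
	mask    uint // bit i set => original packet i was XORed into payload
	payload []byte
}

func xorInto(dst, src []byte) {
	for i := range dst {
		dst[i] ^= src[i]
	}
}

// encode emits m coded packets, each covering a non-empty random subset.
func encode(orig [][]byte, m int, rng *rand.Rand) []coded {
	out := make([]coded, 0, m)
	for len(out) < m {
		mask := uint(rng.Intn(1<<n-1) + 1)
		p := make([]byte, packetLen)
		for i := 0; i < n; i++ {
			if mask&(1<<i) != 0 {
				xorInto(p, orig[i])
			}
		}
		out = append(out, coded{mask, p})
	}
	return out
}

// decode recovers the originals whenever the received masks span GF(2)^n,
// using Gauss-Jordan elimination on the masks and XORing payloads alongside.
func decode(rx []coded) ([][]byte, bool) {
	rows := append([]coded(nil), rx...)
	for col := 0; col < n; col++ {
		pivot := -1
		for r := col; r < len(rows); r++ {
			if rows[r].mask&(1<<col) != 0 {
				pivot = r
				break
			}
		}
		if pivot < 0 {
			return nil, false // not enough independent coded packets yet
		}
		rows[col], rows[pivot] = rows[pivot], rows[col]
		for r := 0; r < len(rows); r++ {
			if r != col && rows[r].mask&(1<<col) != 0 {
				rows[r].mask ^= rows[col].mask
				xorInto(rows[r].payload, rows[col].payload)
			}
		}
	}
	out := make([][]byte, n)
	for i := 0; i < n; i++ {
		out[i] = rows[i].payload
	}
	return out, true
}

func main() {
	rng := rand.New(rand.NewSource(1))
	orig := make([][]byte, n)
	for i := range orig {
		orig[i] = make([]byte, packetLen)
		rng.Read(orig[i])
	}
	rx := encode(orig, n+2, rng) // a little redundancy: n+2 coded packets
	rx = rx[1:]                  // simulate losing one coded packet in transit
	if dec, ok := decode(rx); ok {
		fmt.Println("recovered all originals:", fmt.Sprint(dec) == fmt.Sprint(orig))
	} else {
		fmt.Println("not enough independent coded packets; sender would emit more")
	}
}
```

With sparse subsets the elimination gets cheaper, but as noted above, more extra packets are then needed to reach full rank.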

Now, if your packet loss rate is honestly negligible because your rate matching is so excellent (due to TCP), then you’re right; you don’t even need an ECC. Just a cheap EDC will suffice to confirm whether or not a random error has occurred.

My own sense, in the case of big data clients, is that a greedy approach which saturates all the links in parallel, and uses redundant packets of the aforementioned composition, is going to deliver the packets faster than an efficient approach like TCP which is forced to pay the roundtrip(s) required in order to form a good estimate of receiver bandwidth. The only downside to my approach is that reconstructing N packets requires an NxN matrix inversion (slightly worse, actually, due to the overcomplete dictionary), which is superlinear in N. But this would be thin work for a GPU in most cases.

My point is really just that network needs vary widely among big compute loads, archival work, and mobile messaging and video. It might pay to keep an open mind about radically different protocols for such radically different applications.

For the moment, though, I’d like to thank you for considering all of this. Maybe it will give you some other constructive ideas indirectly, even if you disagree with my hypotheses. Good luck and thanks for the continued development effort.


I think your approach is in line with (rather than different from) modern TCP-like ARQ protocols (e.g. KCP), which use both ECC and an ACK mechanism. Actually, an ACK mechanism is always needed regardless of whether ECC is used, because reliability should be guaranteed regardless of the packet loss rate, while for any ECC there is always a packet loss limit beyond which the payload cannot be recovered. This is a fundamental limitation regardless of the ECC implementation (linear combinations, XOR, etc.). So the round trip for ACKs is always needed in a reliable transport protocol.

For NKN, we are definitely open to adding ECC when needed, especially if people start to use UDP as low level transport protocol.

And thank you for the discussion as well, it’s very inspiring. We are far from optimal in terms of both tech and ecosystem, so the discussion can definitely continue as we move forward :)

The initial implementation of this NKP can be found at PR 837. The latency multiplier in this version is

(1 + 9 * (1-sqrt(1-(x-1)^2)))

where x = uptime / 72 hours.

The curve and parameters are subject to change in later versions as we collect more data and feedback on this version.
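
For reference, a quick sketch of that multiplier (assuming x is clamped to [0, 1]; PR 837 may handle the boundaries differently):

```go
package main

import (
	"fmt"
	"math"
)

// latencyMultiplier implements (1 + 9 * (1 - sqrt(1 - (x-1)^2))) with
// x = uptime / 72h, clamped to [0, 1] (a simplifying assumption here).
func latencyMultiplier(uptimeHours float64) float64 {
	x := uptimeHours / 72
	if x > 1 {
		x = 1 // after 72 hours of uptime the multiplier stays at 1
	}
	return 1 + 9*(1-math.Sqrt(1-(x-1)*(x-1)))
}

func main() {
	for _, h := range []float64{0, 18, 36, 54, 72, 100} {
		fmt.Printf("uptime %5.0f h  multiplier %.3f\n", h, latencyMultiplier(h))
	}
	// The multiplier falls from 10 for a brand-new node to 1 at 72 hours, with
	// zero slope at the cutoff (the continuity property discussed earlier).
}
```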

I think your square root approach isn’t a bad place to start. As you said, you can optimize over time in light of the network performance that you observe.

My only point about ACK is that you don’t need it after every N packets. You just need a burst of them at the end of the entire transfer (in case some get lost). This means you pay one and only one roundtrip time. I’m not sure what percent of transfer latency is currently spent waiting for ACKs, but the importance (or lack thereof) of this issue really comes down to that. Of course, you may or may not need ECC of any particular robustness, depending on your typical loss rate, which I understand to be very low.

Anyways, I’m pleased to see the progress on PR 837!

Thanks for your feedback! It’s an important part of NKP design.

You just need a burst of them at the end of the entire transfer (in case some get lost).

Actually this won’t work if we want to preserve the reliable transmission assumptions and stay compatible with TCP. To be more specific, we want:

  1. transmission should be reliable: no packet shall be lost when delivered to the receiver
  2. transmission should be in order: the packet order delivered to the receiver needs to be the same as the order the sender passed in

Note that these assumptions might be loosened for specific applications (e.g. a file transfer app can transmit data out of order and write to disk), but for a general transmission protocol that aims for TCP compatibility, these assumptions are required.

Given these assumptions, if you only send ACKs at the end of the transmission, then in the worst case (the first packet is lost) the memory cost of the receiver buffer will be O(N), where N is the number of packets in the whole transfer, because the first packet has to be delivered to the receiver first. And that is the cost per connection. This is definitely not feasible in practice.
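
A tiny illustration of that worst case (hypothetical transfer and packet sizes):

```go
package main

import "fmt"

func main() {
	transferBytes := 1 << 30 // hypothetical 1 GiB transfer
	packetSize := 1024       // hypothetical 1 KiB packets
	packets := transferBytes / packetSize

	// With in-order delivery required and ACKs deferred to the end, losing the
	// very first packet means everything after it must stay buffered until the
	// retransmission arrives: ~O(N) memory per connection.
	fmt.Printf("packets buffered: %d (~%d MiB) for a single connection\n",
		packets-1, (packets-1)*packetSize>>20)
}
```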

Hello, I have a few questions:

  1. How does NKN data transmission achieve real-time tracking (or monitoring or measuring) of latency to neighboring nodes? Is it the neighbor node that broadcasts its own latency at regular intervals?
  2. Is it the client or the node that needs a relay proof? Or do both need relay proofs?

How does NKN data transmission achieve real-time tracking (or monitoring or measuring) of latency to neighboring nodes? Is it the neighbor node that broadcasts its own latency at regular intervals?

The node measures this itself, similar to ping: it sends a packet and measures how long it takes to receive a reply.
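
A minimal illustration of that ping-style measurement (a sketch only; sendProbe is a hypothetical stand-in, not the node’s real API):

```go
package main

import (
	"fmt"
	"time"
)

// measureRTT times one request/response exchange; sendProbe is whatever
// function sends a probe to the neighbor and blocks until the reply arrives.
func measureRTT(sendProbe func() error) (time.Duration, error) {
	start := time.Now()
	if err := sendProbe(); err != nil {
		return 0, err
	}
	return time.Since(start), nil
}

func main() {
	// Stand-in for a real probe to a neighbor node.
	fakeProbe := func() error { time.Sleep(20 * time.Millisecond); return nil }
	rtt, err := measureRTT(fakeProbe)
	fmt.Println(rtt, err) // a node would repeat this periodically and smooth the result
}
```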

Is it the client or the node that needs a relay proof? Or do both need relay proofs?

Nodes need relay proofs in order to mine, but a relay proof also contains information about the clients at both ends (the source and destination clients).
