[NKP-0013] P.E.D.S - Persistent data storage

hello,

while working on a dapp i noticed that pubsub is great, but the messaging system lacks a way to store persistent data, and that will not take us far enough. so another idea came to mind:

e.g. for these use cases:

  • cache for CDN like scenarios
  • storing user profiles
  • offline msg delivery: sending a message to someone who is offline; the receiver can read it when he is back online
  • showing a chat history, not only the current messages

persistent data storage is badly needed. it could be implemented like this (making everything free would invite abuse!):

  • node owners can specify whether they want to provide PEDS storage (not suitable for small cloud instances…) and their price in NKN per MB of storage (can be zero!). this way the market should regulate itself. note that nice people will simply provide free PEDS storage!

  • each user can create a peds_create type transaction and specify up to x nodes (x = the redundancy level he likes) that hold his encrypted data. when the storage period expires, the holding nodes get paid (or receive a donation, if the user donates and the node's offer price is 0)

  • in the NKN client he can view his personal PEDS storage and delete things he does not need.

a dapp could be developed for storing personal data, like dropbox, which would get users to buy NKN and so push the price up. the nkn explorer could also show the current offerings, prices etc. of peds nodes.

I’d say it would be very useful if we had this; however, it’s waaaaaay more complicated than you think. We would need to deal with data integrity, proof of storage (proof of space and time), nodes disappearing, etc. Basically this is what Filecoin and other decentralized storage projects are doing if we are talking about a file/blob store, or Bluzelle and other decentralized databases if we are talking about a key-value store… And none of them have even launched their mainnet yet :joy:

I would say the best strategy might be collaborating with decentralized storage/database project so that we don’t need to re-invent the wheel.

hello,

yes it is true, many coins have this situation now, they want e.g. smart contracts, decentralized file storage, vpn, chain swaps etc. and they can decide:

  1. reinvent the wheel - waste lots of money, time and other resources
  2. try to partner up / integrate with other coins.
  3. make a very small, simple version themselves

btw. i found out that btfs (https://docs.btfs.io/docs/what-is-btfs) seems to offer exactly this: decentralized storage. it is worth looking into; it seems to be at testnet or devnet stage.

idk what's best, but it's really about time that chains like bluzelle and nkn can somehow integrate with each other through apis. or maybe dapp devs just use multiple chains/projects for their dapps, e.g. bluzelle for storage and nkn for communication.

but maybe it's possible for NKN to take a very simple approach? if NKN does communication only, it's less attractive than if it can also provide persistent storage etc.

simple approach:

  • a node can choose to have PEDS enabled and set its price (NKN/MB saved & transferred) & amount of storage
  • an NKN user creates a peds_create TX and chooses e.g. 10 MB on nodes x, y and z for price y & 60 days. for best safety, payment will be made by the NKN system when the rental period is over.
  • if one of the contract's nodes is offline for more than 1 day, the user will not be charged and will be informed that he needs to select new nodes
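To make the three rules above concrete, here is a minimal sketch of the contract terms in Python. It is not an NKN API: the class name, field names and the flat "NKN per MB for the whole rental period" pricing are assumptions made for illustration, and the offline rule is read as "a node that was offline for more than 1 day is simply not paid".

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class PedsContract:
    """Hypothetical peds_create terms; all names here are illustrative only."""
    nodes: List[str]      # storage nodes chosen by the user (redundancy = len(nodes))
    size_mb: float        # storage rented on each node, in MB
    price_per_mb: float   # assumed flat price in NKN per MB for the whole period
    days: int             # rental period in days

    def total_cost(self) -> float:
        # Every node holds a full copy, so the cost scales with the node count.
        return len(self.nodes) * self.size_mb * self.price_per_mb

    def payable_nodes(self, offline_days: Dict[str, float]) -> List[str]:
        # Assumed reading of the offline rule: more than 1 day offline means no payment.
        return [n for n in self.nodes if offline_days.get(n, 0) <= 1]
```

For example, 10 MB on three nodes at an assumed 0.5 NKN/MB would cost 15 NKN in total, and a node that was offline for 2 days would drop out of the payable list.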

it's a big decision for any chain now: reinvent the wheel themselves or partner up, e.g. NKN users could pay NKN to storj to rent storage and get an API. a difficult decision to make.

The simple approach you described is not as simple as you might think… A few simple examples:

  • As a storage solution, you are expecting it to be reliable, right? But what if x, y, z all go offline? What if, when you are getting content, the node gives you the wrong content?
  • Should the content be encrypted? If it’s encrypted, how can you share it with other users? If it’s not encrypted, the storage node or anyone else can see the content.
  • Who determines whether a node is offline or online? Who determines whether the user should be charged or not? Who determines whether a node is storing the correct content, or just randomly generating content when the user requests it?

There are a lot more questions to solve even for the “simple” approach…
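As a small illustration of the "wrong content" question only: one common way to detect (not prevent) a node returning bad data is for the user to keep a hash of what was uploaded and compare it on download. The sketch below assumes SHA-256 and hypothetical helper names; it says nothing about whether the node is still storing the data, which is the much harder proof-of-storage problem.

```python
import hashlib


def content_digest(data: bytes) -> str:
    # Digest the user keeps locally (or on-chain) at upload time.
    return hashlib.sha256(data).hexdigest()


def is_expected_content(downloaded: bytes, expected_digest: str) -> bool:
    # True only if the node returned exactly the bytes that were uploaded.
    return content_digest(downloaded) == expected_digest
```

With redundancy, a user could fetch from the next node whenever this check fails.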

hello yilun,

possible solutions - maybe it is possible to find a simple solution:

As a storage solution, you are expecting it to be reliable, right? But what if x, y, z all go offline? What if, when you are getting content, the node gives you the wrong content?

-> users need to choose a redundancy level based on data importance (r = number of nodes the data is stored on), e.g. r = 2 for less important and r = 16 for more important data. after each expired storage contract the user can make a peds_review TX (making a review requires a rental fee payment in the same tx!) and give a rating (int, range 0-10) plus a text comment on their storage experience, e.g. 10/10 “fast node, data well stored for 6mo”. in the block explorer, the reviews and ratings of nodes can be seen in addition to uptime, storage capacity etc., like this: https://siastats.info/hosts.

resulting effects:

  • fast, trusted nodes have good ratings, can choose higher prices and will be used for important data. faulty nodes will not be used much. if someone wants to fake ratings he needs a lot of time and NKN, since ratings are possible only at contract end and only when paid. just like in real life - read the reviews of others, choose a trusted provider :slight_smile:
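A rough sketch of how an explorer might aggregate such reviews, assuming each peds_review is reduced to a (node, rating, fee paid) tuple. The function name and the rule "only reviews with a positive fee and a rating in 0-10 count" are assumptions for illustration, not part of any existing NKN tooling.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def node_scores(reviews: List[Tuple[str, int, float]]) -> Dict[str, float]:
    """Mean rating per node from (node_id, rating 0-10, fee_paid_nkn) tuples."""
    totals = defaultdict(lambda: [0, 0])  # node -> [rating sum, review count]
    for node, rating, fee_paid in reviews:
        if fee_paid <= 0 or not 0 <= rating <= 10:
            continue  # ignore unpaid reviews and out-of-range ratings
        totals[node][0] += rating
        totals[node][1] += 1
    return {node: s / c for node, (s, c) in totals.items()}
```

Requiring a paid contract behind every review is what makes faking ratings cost time and NKN.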

Should the content be encrypted? If it’s encrypted, how can you share it with other users? If it’s not encrypted, the storage node or anyone else can see the content.

-> to keep it simple: 3 options:
a) encrypted: for each peds contract, create a private key stored in the renter's wallet.
b) not encrypted: the renter can choose this for data that's not sensitive
c) symmetric encryption, i.e. encrypted with a password, e.g. with AES

Who determines whether a node is offline or online? Who determines whether the user should be charged or not? Who determines whether a node is storing the correct content, or just randomly generating content when the user requests it?

instead of complex technical solutions, i would suggest that PEDS payments for storage are simply like donations - there are no checks or guarantees that renters will pay, but if most pay, the system works well.
anyway, storage is cheap, so no gold mine is to be expected. SIA, STORJ and other systems - no gold mines there!

BUT if someone makes a payment transaction (TX type peds_contractfee_pay), it is technically verified that it is at least the initially agreed price. NKN coins will be frozen at contract start to ensure enough balance is there!
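The freeze-then-pay idea could look roughly like this. It is a toy in-memory model under stated assumptions: the `Wallet` class and its method names are hypothetical, and the real check would have to live in transaction validation, not in client code.

```python
class Wallet:
    """Hypothetical renter wallet; not an NKN API."""

    def __init__(self, balance: float) -> None:
        self.balance = balance
        self.frozen = 0.0

    def freeze(self, agreed_price: float) -> None:
        # At contract start: lock the agreed price so it cannot be spent elsewhere.
        if agreed_price > self.balance - self.frozen:
            raise ValueError("not enough free balance to freeze")
        self.frozen += agreed_price

    def pay_contract_fee(self, amount: float, agreed_price: float) -> float:
        # Paying is voluntary, but a payment must cover at least the agreed price.
        if amount < agreed_price:
            raise ValueError("payment is below the agreed price")
        if amount > self.balance:
            raise ValueError("payment exceeds balance")
        self.frozen -= agreed_price
        self.balance -= amount
        return amount
```

Anything paid above the agreed price would simply act as a tip or donation to the node.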

general observation:

if someone wants to go the way of using other blockchains for data storage: keywords are bluzelle, ipfs, btfs and ethereum swarm. some of these projects look like they can be used right now to store and provide data in a decentralized way - but NKN also needs some fame and features, i think!

This sounds like a much more mature plan!

Just to be clear, personally speaking, if someone decides to build something like this, I will definitely support it, but I just want to make sure I understand why people want to do it first.

IMO, a better XXX is always not a good idea. For example, if we just want something exactly like Storj, but just using NKN token instead of ETH, then I think it’s definitely better to just use Storj. Or probably better, with some cross-chain solution as the bridge between NKN and ETH to achieve native-like behavior. Ideally if we decide to build something, we would want it to have enough unique advantages.

I would start with a very rough (and definitely not mature) idea and hope we can think more on top of it or come up with something better: NKN provides an off-chain data transmission layer, and we can use that capacity to provide simple storage (but not as part of nkn core). We can use subscribe to post storage provider info (capacity, price, etc.), the user can use publish to get in contact with providers and probably measure performance, and use p2p messages to get/put/delete files (or key-values, depending on what we want). We can use NanoPay, which (although not officially announced yet, it is already usable) should be enough for the off-chain payment part. This is unique to NKN because no other blockchain has an off-chain data transmission layer like we do. It does not have the guarantees of an on-chain solution, so it might be less reliable (for both user and provider), but it should be fast enough and much, much easier to implement.

hello,

ok it becomes clear: building a decentralized storage like storj, siacoin, filecoin costs a lot of resources (money, time), is difficult & needs really well proven components in the areas replication, fault tolerance, contract management etc.

ok yilun, we can leave this idea here, maybe someone picks it up, i am involved in other things so i cannot.

in general the challenging situation:

to have a reliable, well-working, fast decentralized persistent storage system you need all of these features implemented: redundantly stored data, live recovery (if nodes fail, copy to other ones) and accounting (pay for the amount of data stored & transferred). these can usually only be offered if you pay for them.

i found alternatives. i focused on free decentralized storage, but all have some issues. i think for best reliability developers should simply use paid decentralized storage once it goes live, e.g. sia or filecoin:

ethereum swarm: i looked into the codebase of ethereum's open source swarm project (see https://swarm-guide.readthedocs.io/en/latest/introduction.html) - “Swarm is a censorship resistant, permissionless, decentralised storage and communication infrastructure.”. right now you can connect to their swarm, send files (they will be distributed in chunks across swarm nodes), get a file ID back and retrieve the file later.

issue here: anyone can use the swarm, and you have no guarantee that your data will be stored. if someone loads a lot of data into the swarm, yours may be gone.

iota tangle: surprisingly, in each IOTA transaction you can send some data, and sending is free; you just need to verify other transactions while verifying yours. issue here: about every month “snapshots” are made, which clear these zero-value transactions.

now you can host your own swarm or tangle, but then you need access control so that strangers do not fill up your nodes with bogus data, which makes the whole thing centralized again :frowning:

very simple solution (but not really scalable) would be to store the data onchain. e.g. some transaction type that you can attach custom data payload to and pay according to payload size. but because it’s onchain and persistent payload should be limited in size.

also, this is easily doable w/o separate tx type once we have smart contracts.

very simple solution (but not really scalable) would be to store the data onchain.

i thought of that, but it will just use up the disk space of all full nodes; it should be separated somehow. and more features like pruning, pinning, tagging etc. are still needed for it to work well.

i was thinking we could make a private ipfs network (meaning we can reuse well-tested, well-working ipfs code) + encourage people in the NKN community who have free disk space to join it. a small application could allow only NKN holders (checked using the NKN api, and whose funds have not moved for 24h) to enter the network + airdrop some nkn from the miner bonus to participants every month.

@yilun what do you think of this solution? i think its an easy way to provide NKN (dapp) developers with some storage. not always a complex difficult technical solution is needed.

Why not just use the public version of ipfs?

Also I don’t think it’s a good idea to incentivize such storage using NKN token without a proven proof of storage system, otherwise it will become a disaster. That’s one of the reasons ipfs has been free for years and why we need Filecoin when we want incentives.

That sounds good!

hello,

i suggested a separate ipfs or ethereum swarm network ONLY for holders of NKN because otherwise you get all kinds of traffic there. these systems act like a cache and usually have e.g. 10gb of local cache storage, which means you do not want too much unrelated data there. if anyone can join, they can send around lots of their own data and drain the caches that way.

What I meant is that, if we just want a free storage, we don’t need any IPFS storage node at all since the public ipfs network is free to use. Actually we have used it in the web3infra d-chat to store user uploaded images. But the public ipfs network is very slow and not quite reliable. I’m not sure if having a private ipfs network can solve that :joy:

hello,

i am not sure how best to proceed. i am sure about the requirements i and probably others have:

  • be able to have some storage that is reliable (fast, redundant) and ideally free, or earned by doing some pow or storing others' files. this is needed e.g. for dapps with history, profiles etc. smart contracts are in general too expensive for storage/execution!

maybe it can work like IOTA, which provides free transactions (their idea: send a free tx by confirming 2 other TXes + some pow). maybe in the storage world the same could be possible - get storage by storing others' files.

Hey @zero24x I’d like to invite you to try out the new version of https://github.com/nknorg/nkn-file-transfer to see how you think it would work as a PoC of the simple storage.

On the storage provider you can run

nkn-file-transfer -receive -host

And on the user you can run

nkn-file-transfer -send -get

and then use

GET address/file

and

PUT address/file local_file

to get and put files just like HTTP GET and PUT.

@yilun:

Thanks for the work. Instead of focusing on too many details, let us think of the bigger picture:

Improvement idea: Web gateway or FTP protocol

If it can act like a web server handling content requests etc., can you expose a port for downloading files? Maybe someone wants to host a public gateway. Like with IPFS, you could use the CLI, a local HTTP server or public gateways to download files.
Or maybe use the FTP or SFTP protocol so you can connect e.g. with FileZilla or Cyberduck and easily transfer files?

Core Integration:

Maybe this tool could be built into the NKN core command line client? It would look cool if you could announce that there is a new feature: file transfer / storage in NKN. The more features NKN core has, the more attractive it gets!

In addition to this you could create a d chat room where people can sell their storage or something (all based on trust, no automatic redundancy, storage completeness etc.).

Bonus: Someone makes a GUI.

Just like you can send coins in NKN you could send files also.

I still somehow like the idea that you can utilize NKN core for storing files, but sadly it adds a lot of the complexity (redundancy, contract breach checks, auto failover etc.) that we discussed previously.

I have too many ideas and no idea how to do things the best way lol

Test results of what you have implemented so far

Speed: Works nicely, I was able to max out my test machine's upload speed (6 Mbit/s)

UI: If it's not too complex, maybe you can show the speed in B/s, KB/s etc. I am sure a library helps. I mean, please show friendly human units, e.g. 10MB/s, not 10000… bytes.
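A minimal sketch of what such a formatter could look like, without any extra library. The function name is made up, and it assumes decimal units (1 KB = 1000 bytes); the actual tool may prefer a different convention.

```python
def human_rate(bytes_per_sec: float) -> str:
    # Step through decimal units until the value drops below 1000.
    units = ["B/s", "KB/s", "MB/s", "GB/s"]
    value = float(bytes_per_sec)
    for unit in units:
        if value < 1000 or unit == units[-1]:
            return f"{value:.1f} {unit}"
        value /= 1000
```

So a raw 10000000 bytes per second would be shown as 10.0 MB/s, and a 6 Mbit/s uplink (750000 bytes per second) as 750.0 KB/s.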

File consistency: I did just 1 test, it works nicely, 60MB transferred 100% correctly, wow!

Also when u have a GUI you can post nice screenshots of a new NKN based tool in discord, twitter etc. and everyone will be hyped!

Notes:

  1. Are you using a chunk size of 1024 bytes? I know it depends on the connection speed, latency etc., but maybe raise the default value to 50kb (less overhead, faster)? idk. maybe your client even transfers 1024 bytes over 8 clients in parallel?

The current version actually already has an http mode, and very importantly, it supports the HTTP range header, which means you can stream while downloading. A few examples:

An image: http://157.230.136.253:8080/b0561ed79ad20380ef582202213dc806dac03736988ae40dc05d5905407038d9/latency.png
A small video: http://157.230.136.253:8080/b0561ed79ad20380ef582202213dc806dac03736988ae40dc05d5905407038d9/abstract_video_1.mp4
A large video: http://157.230.136.253:8080/b0561ed79ad20380ef582202213dc806dac03736988ae40dc05d5905407038d9/ml.mp4
A 720p video: http://157.230.136.253:8080/b0561ed79ad20380ef582202213dc806dac03736988ae40dc05d5905407038d9/animal.mp4

The servers are temporary and we might close or change them at any time, so if it’s not working, you can test it on your own computer.

The chunk size is 1024 bytes in order to max out TCP packet to MTU. Any larger value shouldn’t help but will increase latency. Also currently there are 8 clients and 1024 concurrent workers, which means at most 1024 packets can be sent at the same time without receiving any recipient confirmation.
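To put those numbers in perspective, a quick back-of-the-envelope sketch. The constants come from the figures above; the helper names are made up, and the 60 MB test file mentioned earlier is assumed to mean 60 × 1024 × 1024 bytes.

```python
from typing import List

CHUNK_SIZE = 1024   # bytes per chunk, as stated above
WORKERS = 1024      # concurrent workers, as stated above


def chunk_count(file_size: int) -> int:
    # Number of chunks needed, rounding up for a partial last chunk.
    return (file_size + CHUNK_SIZE - 1) // CHUNK_SIZE


def max_in_flight_bytes() -> int:
    # Upper bound on payload sent before any confirmation arrives.
    return CHUNK_SIZE * WORKERS


def split_chunks(data: bytes, size: int = CHUNK_SIZE) -> List[bytes]:
    # Cut a payload into fixed-size chunks; the last one may be shorter.
    return [data[i:i + size] for i in range(0, len(data), size)]
```

Under these assumptions the 60 MB file is 61440 chunks, and at most 1 MiB of payload is unconfirmed at any moment.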

@yilun

wow that's cool, yes HTTP range is great since it allows streaming and also resuming downloads, e.g. if a big file transfer fails because the WIFI connection got lost.
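For anyone who wants to try resuming against the http mode, here is a small sketch using only the Python standard library. The helper name and the example URL are placeholders, and it only builds the request; a server that honors the range should answer with 206 Partial Content.

```python
import urllib.request


def resume_request(url: str, bytes_already_downloaded: int) -> urllib.request.Request:
    # Ask the server for everything from the given byte offset to the end of the file.
    range_value = f"bytes={bytes_already_downloaded}-"
    return urllib.request.Request(url, headers={"Range": range_value})
```

Passing the result to `urllib.request.urlopen` and appending the response body to the partial file would complete the download.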

ok yes, the 1024 byte choice looks good now, now that i know why you chose it.

now i think there is a big challenge:

if you put the transferclient on github, it will get “lost” there, i mean 90% of the users are GUI users, so they cannot download a GUI there and also it is not linked on the website…

suggestions:

  1. build a GUI & release gui + cli binaries for linux, mac & windows on github

  2. or alternatively make it part of NKN core + release a GUI. maybe a community member (i do not have time for it) can help here? i think this would work as a non-mandatory update since it does not change consensus, so it's not a big deal because only people who want it need to upgrade! it would make the NKN core stronger & more interesting if the core includes file transfer. maybe a community member also wants to host a public transfer gateway (for some NKN?)?

  3. either way, i think it is good if this file transfer client somehow gets ready to be used by everyone through GUI + cli binary releases + some “marketing posts” in discord etc.

Yes, agreed, a CLI is not friendly to most users. Currently it’s just something we are trying out, but later we’d like to build, or let someone build, a real app that most people can use