Generic Object-Sharing Protocol
Node Identification
Every node generates a random id string
This is the node id (node identifier)
Only on first use
This should be globally unique
It is stored in the node's database for later reuse
A hash is generated for it
Hashed data:
Node's IP address and hostname
Some random characters
This id does not change as long as the database is not purged
Another id is generated per session
This is the SID (Session IDentifier)
It is distributed to the nodes
It is stored together with the Node-Id
So others can validate both together
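The id scheme above could be sketched as follows. This is a minimal illustration, not the protocol's definition: it assumes the node id is the SHA-256 hash of the listed data, and the function names and the dict standing in for the node's database are hypothetical.

```python
import hashlib
import secrets


def generate_node_id(ip: str, hostname: str) -> str:
    """Hash the node's IP, hostname and some random characters (SHA-256 is an assumption)."""
    salt = secrets.token_hex(16)  # the "random characters"
    material = f"{ip}:{hostname}:{salt}".encode("utf-8")
    return hashlib.sha256(material).hexdigest()


def load_or_create_node_id(db: dict, ip: str, hostname: str) -> str:
    """Reuse the stored id; generate a new one only on first use or after a purge."""
    if "node_id" not in db:
        db["node_id"] = generate_node_id(ip, hostname)
    return db["node_id"]


def generate_session_id() -> str:
    """Create a fresh random id for one session."""
    return secrets.token_hex(32)
```

Because of the random part, purging the database yields a different node id even on the same host, which matches the bypass described further down.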
Logging should only be enabled for debugging purposes
Blocking IPs or Node-Ids on master-nodes is not planned
Censorship would be far too easy
By government agencies or corporate parties
Censorship makes no sense here
It can be bypassed very easily:
Delete the Node-Id in the database
A new one is generated
A blocked IP or port number can be bypassed with proxies
One or two master-nodes should listen on ports commonly unblocked by firewalls
Like 80/443/110/25
Hubs can be optionally registered by master-nodes
Increases karma because the node admin is verified
Unregistered nodes do not receive negative votes
Bootstrapping
At least one master-node is required, preferably 3 to 4
Also known as "Bootstrap-Nodes"
They should be listed in the configuration for all applications
A comma-separated list of node IPs with port numbers separated by a colon (:)
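A small sketch of reading such a list, assuming IPv4 addresses or hostnames (IPv6 literals would need brackets and are not handled). The function name `parse_bootstrap_nodes` is hypothetical.

```python
from typing import List, Tuple


def parse_bootstrap_nodes(value: str) -> List[Tuple[str, int]]:
    """Parse 'ip:port,ip:port' into (host, port) pairs."""
    nodes = []
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate trailing commas
        host, sep, port_text = entry.rpartition(":")
        if not sep or not host:
            raise ValueError(f"missing port in {entry!r}")
        port = int(port_text)  # raises ValueError for non-numeric ports
        if not 1 <= port <= 65535:
            raise ValueError(f"port out of range in {entry!r}")
        nodes.append((host, port))
    return nodes
```

For example, `parse_bootstrap_nodes("192.0.2.10:80, 198.51.100.7:443")` yields two pairs that an application can try in order.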
Bootstrap-Nodes work stand-alone
No central "Super-Node" is required
Too much traffic would have to flow through it
Censorship attacks on the network are reduced
Traffic does not increase the overall network load
Small disadvantage:
Hubs must register with ...
... more than one master-node ...
... or connect with each other
1. Node checks if there is a list of master nodes already stored
If so, it skips the step of fetching the node list
2. Node announces itself to the upper bootstrap hub(s)
This should be done generically to keep things simple
An XML document with all necessary data might be a good choice
The session id will not be included here
A bootstrap node will never try to connect clients with nodes
It should only "bootstrap" (tell the node where it should start sharing its objects)
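Since the format is still open, the following is only one possible shape for the announcement in step 2. All element names (`announcement`, `node_id`, `address`, `object_types`, `object_type`) are invented for illustration; the session id is deliberately left out, as stated above.

```python
import xml.etree.ElementTree as ET
from typing import Iterable


def build_announcement(node_id: str, host: str, port: int,
                       object_types: Iterable[str]) -> str:
    """Build a hypothetical XML announcement without the session id."""
    root = ET.Element("announcement")
    ET.SubElement(root, "node_id").text = node_id
    ET.SubElement(root, "address").text = f"{host}:{port}"
    types = ET.SubElement(root, "object_types")
    for name in object_types:
        ET.SubElement(types, "object_type").text = name
    return ET.tostring(root, encoding="unicode")
```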
3. Node fetches a list of other nodes
They must have at least X matching object types
If a bootstrap node is full, it forwards the node to another bootstrap hub
If that one is full as well, the node is forwarded to a list node
If neither a free bootstrap node nor a list node is available, the node waits some time and tries again
Hashes of node-lists distributed over the bootstrap and list nodes should match
This can be ensured by a DHT
DHT = Distributed Hash Table
Which format?
If too many are inconsistent:
No connection can happen
Node list is rejected
Or the bootstrap-nodes are working as regular nodes
Replication of the node-list is required by all bootstrap-nodes
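The consistency check could look roughly like this. It is a sketch only and does not implement a DHT: it just compares the hashes reported by several bootstrap and list nodes. SHA-256, the sorted canonical form, and the `max_inconsistent` threshold are all assumptions.

```python
import hashlib
from collections import Counter
from typing import Dict, Iterable, Optional


def node_list_hash(nodes: Iterable[str]) -> str:
    """Hash a node list in a canonical (sorted) order so equal lists match."""
    canonical = "\n".join(sorted(nodes)).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


def agreed_hash(reported: Dict[str, str],
                max_inconsistent: float = 0.25) -> Optional[str]:
    """Return the majority hash, or None if too many sources disagree.

    `reported` maps a source node id to the hash it reported.
    """
    if not reported:
        return None
    winner, votes = Counter(reported.values()).most_common(1)[0]
    inconsistent = 1 - votes / len(reported)
    return winner if inconsistent <= max_inconsistent else None
```

A `None` result corresponds to the "node list is rejected" case above.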
4. Node connects to gathered master nodes
It again announces its object types to the master nodes
Again it provides the session id so the master node can map session id -> node id
5. If all authorization steps are completed:
The node starts to accept client connections
(It already listens for them but rejects them)
Objects will now be shared with other nodes which accept the same object types
Karma
Karma is given for validating entries in the DHT
Last activity in the recent past
Does not affect karma
Returned pings
Number of sent pings
If there is no reply, the node is dead-listed
Failed pings reduce karma
Slow responses reduce karma
Karma votes for other nodes must not be too negative
Reduces manipulation chances
Prefer karma votes of trusted nodes
Negative karma votes for untrusted nodes reduce own karma
Too many "spam packets" reduce karma
Validated packets increase karma
Protocol version should not be too old
This affects karma only negatively
An up-to-date protocol does not increase karma
Also serves as a "spam protection"
Received protocol version of a node is older than the stored one
Karma is reduced
Received protocol version is much older than that of the master-nodes
Karma is reduced
Provided object types by the peer hub
This affects karma only negatively
New types must first be known by masters
This should be configurable:
Karma should be reduced...
... or peer node should be black-listed
Because every node can be a master-node, censorship is really hard
Logging out correctly
Does not affect karma
Logout must be done by master node and active nodes
"Bye" message
Rotation of dynamic IPs should be considered
Must be registered by master-node
ID is registered as "Dynamic IP"
So connections are still possible
No negative votes by other nodes
The current IP spreads well through the network
The master-node is queried only in case of doubt
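The karma rules in this section can be summarized as a table of events and adjustments. The outline gives no numbers, so every weight below is an illustrative assumption; only the direction (negative, positive, or no effect) follows the text. The names `KARMA_DELTAS` and `PeerKarma` are hypothetical.

```python
from dataclasses import dataclass

# Direction follows the outline; the magnitudes are assumptions.
KARMA_DELTAS = {
    "validated_dht_entry": 1,
    "validated_packet": 1,
    "failed_ping": -2,
    "slow_response": -1,
    "spam_packet": -3,
    "outdated_protocol": -2,
    "unknown_object_type": -1,
    "recent_activity": 0,    # explicitly does not affect karma
    "current_protocol": 0,   # an up-to-date protocol earns nothing
}


@dataclass
class PeerKarma:
    """Running karma score kept for one peer."""
    karma: int = 0

    def record(self, event: str) -> int:
        """Apply the adjustment for an event and return the new score."""
        if event not in KARMA_DELTAS:
            raise KeyError(f"unknown karma event: {event}")
        self.karma += KARMA_DELTAS[event]
        return self.karma
```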
Update Messages
Are only broadcast from bootstrap-nodes to master- and list-nodes
Regular nodes will not receive update messages, as this would cause heavy network load
Maybe only "good" nodes should receive this?
Contains update notes and importance level
"Client" Connections
Should be interpreted as "application software"
Clients should also generate a "client id"
Both id and sid
Also connect to bootstrap-nodes first
Ask for a node-list as well
Also receive karma from nodes
Dynamic IPs are also accepted and therefore must be registered
Client<->Node Communication
After a client has bootstrapped it announces all its object types to the nodes
Including acceptance of broadcasts, poll-mode and Ping-POST
By this the nodes know clients and their accepted object types
Clients may download a node-list for a specific object type
Distinct-List-Mode
After selecting a node the client can request a list of clients from that hub
The client can accept objects from and send objects to these clients
E.g. news by broadcast
Clients may send "broadcast" objects
Broadcast-Mode
Must be allowed by nodes
This consumes traffic
Acceptance of broadcasts is known to list-/master- and bootstrap-nodes
A client sends its broadcast to the master-nodes
They distribute it to their fellow nodes
A node knows which client accepts broadcasts and "deposits" it for the client
Clients request such broadcasts in poll-mode or are "pinged"
In poll-mode the client asks the node for new broadcasts on a regular basis
A Ping-POST is sent by the node as a regular HTTP-POST request to the client
This also happens on a regular basis
A node-admin may allow both types independently
If none is allowed the node acts as a "relay"
And therefore it cannot accept clients with broadcast-functionality enabled
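The deposit and poll-mode behaviour described above might be modelled like this. It is an in-memory sketch only: no networking, no Ping-POST, and the class and method names are hypothetical.

```python
from collections import deque
from typing import Deque, Dict, List


class BroadcastNode:
    """Keeps deposited broadcasts per client until the client polls."""

    def __init__(self, allow_poll: bool = True) -> None:
        self.allow_poll = allow_poll  # set by the node admin
        self._deposits: Dict[str, Deque[bytes]] = {}

    def register_client(self, client_id: str, accepts_broadcasts: bool) -> None:
        """Remember a client; only broadcast-accepting clients get a deposit box."""
        if accepts_broadcasts:
            self._deposits.setdefault(client_id, deque())

    def deposit(self, broadcast: bytes) -> int:
        """Deposit a broadcast for every accepting client; return how many."""
        for box in self._deposits.values():
            box.append(broadcast)
        return len(self._deposits)

    def poll(self, client_id: str) -> List[bytes]:
        """Hand over and clear everything deposited for this client."""
        if not self.allow_poll:
            raise PermissionError("poll-mode is not allowed on this node")
        box = self._deposits.get(client_id)
        if box is None:
            return []
        items = list(box)
        box.clear()
        return items
```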
Client-Client Communication
May be done "anonymously" over the node or directly with another client
Communication over the node is done in poll-mode or by Ping-POST
In direct client-client communication, client "A" sends a Ping-POST directly to client "B"
Wrongly sent Ping-POSTs (e.g. the admin doesn't allow them) may be answered with a regular HTTP status '4XX'
Usage of low-level protocols
Already existing low-level protocols like TCP/IP and UDP should be used
TCP should be used for "inter-communication"
UDP should be used for "streaming" the objects to other nodes
Parties generate hashes of chunks for validation
Chunks should only be created for very big objects
Total object size is larger than X KByte
The sender creates hashes and adds them to the chunk
The receivers validate them
No sequence numbers à la TCP are generated
The last chunk packet contains both hashes
Hash of itself and the final hash
If a hash fails to validate, it is collected
After the final chunk was sent, failed chunks are re-requested
This is retried X times per hash
But always at the end of the whole transaction and all together
If some hashes still fail to transfer
The object is dropped or requested again in full
This should be configurable by the admin
To do so, the final hash and object type is submitted to the sender
"Retransmit-Message"
The sender now tries smaller chunks
If everything was received successfully
The receiver sends a "done-message" to the sender with final hash and object type
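A sketch of the chunk hashing described above, without any actual UDP transport or retry loop. SHA-256 and the names `ChunkPacket`, `make_packets` and `failed_hashes` are assumptions; the outline does not name a hash function or a packet layout.

```python
import hashlib
from dataclasses import dataclass
from typing import List, Optional


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


@dataclass
class ChunkPacket:
    payload: bytes
    chunk_hash: str                   # hash of this chunk, added by the sender
    final_hash: Optional[str] = None  # hash of the whole object, last packet only


def make_packets(obj: bytes, chunk_size: int) -> List[ChunkPacket]:
    """Sender side: split the object and attach a hash to every chunk."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    pieces = [obj[i:i + chunk_size] for i in range(0, len(obj), chunk_size)] or [b""]
    packets = [ChunkPacket(piece, sha256_hex(piece)) for piece in pieces]
    packets[-1].final_hash = sha256_hex(obj)  # last packet carries both hashes
    return packets


def failed_hashes(packets: List[ChunkPacket]) -> List[str]:
    """Receiver side: collect the hashes of chunks that did not validate."""
    return [p.chunk_hash for p in packets if sha256_hex(p.payload) != p.chunk_hash]
```

The list returned by `failed_hashes` is what the receiver would re-request in one batch after the final chunk has arrived.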
There is also a "real" streaming mode
This is e.g. used for chat
For this TCP/IP is used and no hashes are generated
Also no chunks are generated
Only in this mode "multi-casting" is possible
Fault Tolerance / Reliability
After X failed connection attempts a node is removed
Other nodes report this to the master-node
The master-node probes the failed node and removes it
Failed list-node
Hubs report it to the master-node
The master-node probes the failed list-node and removes it
Failed master-node
List-nodes take over the role of a master-node if no bootstrap-nodes are available
This takeover should not be complete, and its scope should be defined
If there is no list-node, nodes look for an active master-node
They report the failed master-node to it
If additionally no master-node is up, a node will be elected as new master-node
To do so, all nodes identify the node with...
... the best karma
This is known to many nodes
... the most votes
A "vote" is a positive karma rating
Also known to many nodes
The "election" should take place within a specific timeout
If no election takes place, the node with the most connections is elected
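One way to read the election rules is sketched below. The outline does not say how karma and votes are combined, so ranking by karma first and using votes as a tie-breaker is an assumption, as are the names `Candidate` and `elect_master` and the `election_held` flag standing in for the timeout.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Candidate:
    node_id: str
    karma: int
    votes: int        # number of positive karma ratings received
    connections: int


def elect_master(candidates: Iterable[Candidate],
                 election_held: bool = True) -> Optional[Candidate]:
    """Pick the new master-node.

    With an election: best karma wins, votes break ties (assumed ordering).
    Without one (timeout expired): the node with the most connections wins.
    """
    pool = list(candidates)
    if not pool:
        return None
    if election_held:
        return max(pool, key=lambda c: (c.karma, c.votes))
    return max(pool, key=lambda c: c.connections)
```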
If one of the bootstrap-nodes is up again
The elected node notifies some of its fellow nodes that the bootstrap-node is back
The elected node becomes a regular node and notifies other nodes on connection attempts
Disadvantages:
A new node with only knowledge about the bootstrap-nodes may not be able to connect to the nodes
Additional bootstrap-nodes on other servers and/or continents may help here
Object Types
New object types can only be added by updating the software
This is also possible by 3rd parties
Must be known by master/bootstrap-nodes
Outdated object types are marked "deprecated" for a longer time
Master-nodes may accept or reject them
A "deprecation message" is always sent
A note of a required update can optionally be added
After the deprecation period they are treated as "unknown"
Other nodes should ask bootstrap-nodes
This compensates for errors made by master-nodes
Object types wrongly deprecated by a master-node result in bad karma from the bootstrap-node
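The life cycle of an object type described above (known, deprecated, then unknown) can be sketched as a small status function. Timestamps in seconds, the parameter names, and the `TypeStatus` enum are assumptions made for illustration.

```python
from enum import Enum
from typing import Optional


class TypeStatus(Enum):
    KNOWN = "known"
    DEPRECATED = "deprecated"
    UNKNOWN = "unknown"


def object_type_status(registered: bool,
                       deprecated_at: Optional[float],
                       now: float,
                       deprecation_period: float) -> TypeStatus:
    """Classify an object type as seen by a master-node."""
    if not registered:
        return TypeStatus.UNKNOWN     # never announced to the masters
    if deprecated_at is None:
        return TypeStatus.KNOWN
    if now - deprecated_at < deprecation_period:
        return TypeStatus.DEPRECATED  # still usable, deprecation message is sent
    return TypeStatus.UNKNOWN         # deprecation period has passed
```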