Spark RLD: Dynamic Hypercube Routing for Blockchains
With a bit of background on other elements of Spark under our belt, it is time to cover resource location and discovery (RLD for short) within it. Eagle-eyed readers will have noticed a hint of this in my previous post, Part IV: Spark IDs, which included the above image of a hypercube.

This involves the routing systems used by the Spark network. Beyond bootstrapping, which usually involves out-of-band mechanisms to get going, there is the more important issue of routing on the network to find other nodes.

I will cover Spark's details directly in a subsequent post, but to start I wanted to have some elements to reference: a revised set of introductory content from way back (more specifically, weblog posts from 2002, when I was completing my Ph.D. thesis on Manifold).

Spark RLD: Manifold Redux

First off, the Ph.D. thesis abstract, submitted to Trinity College Dublin on 30 September, 2003.

Sorry about the use of the royal "we", but this is pretty much a copy/paste of the abstract from the dissertation. Also: maybe it takes some flights of fancy in terms of possibilities, but that's the point of research, isn't it?

Anyway, here goes:

Self-Organizing Resource Location and Discovery

Networked applications were originally centered around backbone inter-host communication. Over time, communications moved to a client-server model, where inter-host communication was used mainly for routing purposes. As network nodes became more powerful and mobile, traffic and usage of networked applications has increasingly moved towards the edge of the network, where node mobility and changes in topology and network properties are the norm rather than the exception.

Distributed self-organizing systems, where every node in the network is the functional equivalent of any other, have recently seen renewed interest due to two important developments.
First, the emergence on the Internet of peer-to-peer networks to exchange data has provided clear proof that large-scale deployments of these types of networks provide reliable solutions. Second, the growing need to support highly dynamic network topologies, in particular mobile ad hoc networks, has underscored the design limits of current centralized systems, in many cases creating unwieldy or inadequate infrastructure to support these new types of networks.

Resource Location and Discovery (RLD) is a key, yet seldom-noticed, building block for networked systems. For all its importance, comparatively little research has been done to systematically improve RLD systems and protocols that adapt well to different types of network conditions. As a result, the most widely used RLD systems today (e.g., the Internet's DNS system) have evolved in ad hoc fashion, mainly through IETF Request For Comments (RFC) documents, and so require increasingly complex and unwieldy solutions to adapt to the growing variety of usage modes, topologies, and scalability requirements found in today's networked environments.

Current large-scale systems rely on centralized, hierarchical name resolution and resource location services that are not well-suited to quick updates and changes in topology. The increasingly ad hoc nature of networks in general and of the Internet in particular is making it difficult to interact consistently with these RLD services, which in some cases were designed twenty years ago for a hard-wired Internet of a few thousand nodes.

Ideally, a resource location and discovery system for today's networked environments must be able to adapt to an evolving network topology; it should maintain correct resource location even when confronted with fast topological changes; and it should support work in an ad hoc environment, where no central server is available and the network can have a short lifetime.
Needless to say, such a service should also be robust and scalable.

The thesis addresses the problem of generic, network-independent resource location and discovery through a system, Manifold, based on two peer-to-peer self-organizing protocols that fulfill the requirements for generic RLD services. Our Manifold design is completely distributed and highly scalable, providing local discovery of resources as well as global location of resources independent of the underlying network transport or topology. The self-organizing properties of the system simplify deployment and maintenance of RLD services by eliminating dependence on expensive, centrally managed and maintained servers.

As described, Manifold could eventually replace today's centralized, static RLD infrastructure with one that is self-organizing, scalable, reliable, and well-adapted to the requirements of modern networked applications and systems.

The 30,000 ft. view

Resource Location, Resource Discovery

In essence, Resource Location creates a level of indirection, and therefore a decoupling, between a resource (which can be a person, a machine, a software service or agent, etc.) and its location. This decoupling can then be used for various things: mapping human-readable names to machine names, obtaining related information, autoconfiguration, supporting mobility, load balancing, etc.

Resource Discovery, on the other hand, facilitates search for resources that match certain characteristics, after which the user can either perform a location request or use the resulting data set directly.

The canonical example of Resource Location is DNS, while Resource Discovery is what we do with search engines. Sometimes, Resource Discovery will involve a Location step afterwards. Web search is an example of this as well.
Other times, discovery on its own will give you what you need, particularly if the result of the query contains enough metadata and what you're looking for is related information.

RLD always involves search, but the lines seemed a bit blurry. When was something one and not the other? What defines it? My answer was to look at usage patterns.

It's all about the user

It's the user's needs that determine what will be used, and how. Here, "user" isn't a person (my stance on why people aren't users should be well known by now) but rather "user" as originally intended in the first multi-user systems. More often than not, RLD happens between systems, at the lower levels of applications. So, I settled on the usage patterns according to two main categories: the locality of the search (local/global), and whether the search was exact or inexact. I use the term "search" as an abstract action, the action of locating something. "Finding a book I might like to read", "finding my copy of Neuromancer among my books", and "finding reviews of a book on the web" are all examples of search as I'm using it here.

Local/Global, defining at a high level the depth that the search will have. This means, for the current search action, the context of the user in relation to what they are trying to find.

Exact/Inexact, defining the fuzziness of the search. Inexact searches will generally return one or more matches; Exact searches identify a single, unique item or set.

These categories combined define four main types of RLD, which are more easily understood using examples:

DNS is Global/Exact.
Google is Global/Inexact.
Looking up your own printer on the network is Local/Exact.
Looking up any available printer on the network is Local/Inexact.

Now, none of these concepts should come as a shock to anybody.
But writing them down, clearly identifying them, was useful to define what I was after. It served as a way to categorize when a system did one but not the other, and to know the limits of what I was trying to achieve.

The Manifold Algorithms

With the usage patterns in hand, I looked at how to solve one or more of the problems, considering that my goal was to have something where absolutely no servers of any kind would be involved.

Local RLD is comparatively simple, since the size of the search space is going to be limited, and I had already looked at that part of the problem with my design of a system for ad hoc wireless networks. Looking at the state of the art, one thing that was clear was that every one of the systems currently existing or proposed for global RLD depends on infrastructure of some kind. In some of them, the infrastructure is self-organizing to a large degree, one of the best examples of this being the Internet Indirection Infrastructure (i3). So I set about designing an algorithm that would work at global scales with guaranteed upper time bounds, which later turned out to be an overlay network algorithm (which ended up being based on a hypercube virtual topology), as opposed to the broadcast type that Nom was. For a bit more on overlays vs. broadcast networks, check out my IEEE article on the topic.

Then the question was whether to use one or the other, and it occurred to me that there was no reason I couldn't use both. It is possible to embed a multicast tree in an overlay and thus use a single network, but there are other advantages to the broadcast algorithm that were pretty important in completely disconnected environments such as wireless ad hoc networks.

Thus Nom became the local component, Manifold-B, and the second algorithm became Manifold-G.

So, Manifold is composed of two different self-organizing algorithms.
The first, Manifold-B (Manifold-Broadcast), is used for local/inexact searches, while Manifold-G (Manifold-Global) is used for global self-organizing exact resolution.

Manifold-B

Manifold-B is a broadcast-style (or flooding-style) p2p system. Basically, on startup a node looks for one or more nodes already in the network (using a list of previously known nodes if available, or other out-of-band methods) and establishes connections to some or all of them depending on individual node load, etc. Once at least one connection is complete, queries can begin to propagate. Queries are constrained by a time-to-live (TTL) parameter, propagating across all connections of each node as long as the maximum number of hops has not been reached, as is common in these types of networks.

Some advantages of Manifold-B are:

Low consistency requirements (i.e., no topology structural requirements).
Allows substring match and multiple key insertion per node.
Fast in its local environment, with network load controlled by number of nodes.
Extremely well-adapted to ad hoc wireless environments.

Some disadvantages:

Local only (not global), and thus limited in reach.
More overhead than Manifold-G for exact queries.

Manifold-G

Manifold-G is an overlay, basically a virtual topology built on top of an actual network. Why an overlay? Because when building a virtual topology, we can give it a well-defined structure that is predictable, with lookup path lengths bounded by a certain limit.

In the case of Manifold-G, the topology used is that of an n-dimensional hypercube (or n-cube, or l-cube, depending on the letter used to specify dimensions :)).
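To make the hypercube structure a bit more concrete, here is a minimal Python sketch of the textbook case: a complete hypercube in which every node has a `dims`-bit identifier and is linked to the nodes whose identifiers differ from its own in exactly one bit. The function names `neighbors` and `route` are illustrative assumptions, and this is plain bit-fixing routing on a complete hypercube, not the actual Manifold-G algorithm.

```python
def neighbors(node: int, dims: int) -> list[int]:
    """Nodes adjacent to `node` in a complete dims-dimensional hypercube.

    Two nodes are adjacent when their identifiers differ in exactly one bit.
    """
    return [node ^ (1 << bit) for bit in range(dims)]


def route(source: int, target: int, dims: int) -> list[int]:
    """Path from source to target, correcting one differing bit per hop.

    Assumption: every intermediate node exists (the hypercube is complete).
    """
    path = [source]
    current = source
    for bit in range(dims):
        mask = 1 << bit
        if (current ^ target) & mask:  # this bit still differs from the target
            current ^= mask            # hop to the neighbor that fixes it
            path.append(current)
    return path


if __name__ == "__main__":
    dims = 4  # a 4-dimensional hypercube has 2**4 = 16 nodes
    print(neighbors(0b0000, dims))
    print([format(n, "04b") for n in route(0b0101, 0b1110, dims)])
```

The number of hops equals the number of bits in which the two identifiers differ, so it can never exceed `dims`, which is log2 of the node count in a complete hypercube.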
Manifold-G assigns each name a particular bit-string, which is controlled by the node with that name. (Local control was an important component of name resolution. Other overlays don't necessarily guarantee locality in that sense, and although you could force them to, they are mostly designed as distributed hashmaps, which maintain a set of keys on each node, something that has other great uses!)

The tricky part is that a hypercube can only be navigated predictably when it is complete, i.e., when all its nodes (vertices) are present. So Manifold-G uses the predictability of the structure to its advantage to virtualize those missing nodes. More on the specifics in a later post (if you can't wait, you can also check the text of the dissertation).

Manifold-G advantages:

Fast! The number of steps to reach the target (or to report "not found") is at most O(log N), with N the number of nodes in the network.
There is no penalty for more nodes joining the network; in fact, the network works better with more nodes because there are fewer virtual nodes to take care of. (This is weird, I know, but true.)

Disadvantages:

High consistency requirements. This is probably the weakest point of the algorithm, very clear during join and leave operations of a node, which require a multi-node transactional status to keep the hypercube topology intact and not break searches. This can be minimized by pre-loading information.
A neighbor in Manifold-G can actually be on the other side of the Internet, which can be a problem when slow connections are involved. In small networks (e.g., a few dozen nodes), Manifold-G actually has more overhead than Manifold-B, because the paths in the hypercube do not necessarily adjust to the actual topology. This advantage of Manifold-B is most notable in small-medium ad hoc wireless networks, which do broadcast at the physical layer level.
In larger networks, some geolocation-based optimizations as well as proxying are possible, as mentioned in the dissertation.

Sidenotes

Aside from serving well for the two main usage modes we are targeting, the algorithms complement each other. For example, Manifold-B has much lower consistency requirements and so can operate when Manifold-G might not be available, and thus either partially cover some queries, or help Manifold-G in building the network.

It has been shown that it is possible to embed broadcast trees within overlays, but in my opinion maintaining two networks is a small price to pay for the advantages of both, particularly on small-to-medium wireless ad hoc networks, where Manifold-B has a clear advantage over the overlay.
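For reference, the TTL-limited propagation described for Manifold-B can be sketched in a few lines of Python. This is a toy, in-memory simulation under assumptions of my own: the names `flood_query`, `links`, and `keys` are hypothetical, the network is a plain dictionary, and duplicate suppression uses one shared `seen` set, where a real flooding network would more typically have each node remember the message identifiers it has already forwarded. It is not the Manifold-B implementation.

```python
from collections import deque


def flood_query(links, keys, origin, query, ttl):
    """Flood `query` outward from `origin`, at most `ttl` hops away.

    links: dict mapping node -> list of directly connected nodes
    keys:  dict mapping node -> list of key strings held by that node
    Returns the nodes holding at least one key that contains `query`
    (a substring match), in the order the flood reaches them.
    """
    matches = []
    seen = {origin}                   # nodes that already received the query
    queue = deque([(origin, ttl)])    # (node, hops still allowed from here)
    while queue:
        node, hops_left = queue.popleft()
        if any(query in key for key in keys.get(node, ())):
            matches.append(node)
        if hops_left == 0:
            continue                  # TTL exhausted: do not forward further
        for peer in links.get(node, ()):
            if peer not in seen:
                seen.add(peer)
                queue.append((peer, hops_left - 1))
    return matches


# Hypothetical four-node chain: a - b - c - d
links = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
keys = {"a": ["scanner"], "b": ["printer-lobby"], "c": [], "d": ["printer-lab"]}

print(flood_query(links, keys, "a", "printer", ttl=2))
print(flood_query(links, keys, "a", "printer", ttl=3))
```

With a TTL of 2 the query started at `a` stops at `c` and finds only the printer on `b`. Raising the TTL to 3 lets it reach `d` as well, which illustrates both the Local/Inexact usage pattern and why reach is limited by the hop count.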