2. Paths and the Ninja Architecture
Figure 1 shows a high level picture of the Ninja architecture. Ninja consists of two broad categories of components: arbitrary, user defined services and infrastructure to advertise, make highly available, and connect those services. The service discovery service (SDS) provides advertisement for services; the MultiSpace provides availability and fault-tolerance by replicating services and maintaining persistent storage; paths compose compatible services and maintain data streams between them.
Services register themselves with a central SDS server which advertises at a well-known broadcast address. Programs in need of specific services can then query the SDS. Unfortunately, the SDS by itself does not provide a mechanism for determining the type of data a reigstered service accepts or produces, or the communication protocol of a service.
Related to the SDS is the XSet, which consists of a database of XML descriptions (of services in SDS), and provides a simple query language whereby descriptions containing a specific set of XML fields can be located.
The MultiSpace is a specific instance of a Ninja infrastructure component known as a base. Bases are simply machines or clusters of machines on which services run. MultiSpaces generally run on clusters of machines, and provide availability by replicating services across several machines. Typically, services running on MultiSpaces also use hard-state, which MultiSpaces maintain via disk-clusters. Another Ninja-base construct is the iSpace, which doesn't offer the same availability or durability features of the Multispace but is more light-weight and can be run on single end-user machines.
A third Ninja base is the active proxy: active proxies act as interfaces between arbitrarily simple devices and the Ninja infrastructure. Since Ninja is relatively heavyweight, relying on a full fledged Java implementation, proxies are integral in that they allow components which don't run Java to participate in the infrastructure. Examples of such components include: pre-existing services, like web-sites, firmware-driven devices, like cell-phones, and low-computing power devices, like PalmPilots.
The primary mechanism for communication between services in Ninja has traditionally been Remote Method Invocation (RMI). Anytime a service want's to query SDS, XSet, or another service, it uses RMI.
Paths fit into the context of Ninja by providing a mechanism whereby services, running on a base, can be (automatically) interconnected via streams (not RMI) and made to interact with each other. The Paths Package uses SDS to locate services, but imposes structure on the services which it handles. Services specific to paths are known as operators; they are distinguished from arbitrary services in that:
(1) They input and output data via some streaming protocol; higher-level communication, such as remote procedure calls, are not supported.
(2) They provide an XML description of their input and output data types and semantics. This description must conform to a specific, well-known schema.
The Paths Package is able to use these two constraints to connect operators together; (1) guarantees all operators can exchange data with each other, although they may not be able to make sense of that data, and (2) makes it possible to determine which operators can understand each other (speak the same data format).
As the rest of this document details, the Paths Package provides modules to build paths of inter-operable operators, to initiate data-flow down these paths, and to recover from failure in the event that one or more operators in a path goes down.
3. Terminology
Before describing each of these modules in detail, a summary of the terminology of paths is necessary. As discussed above, paths consist of chains of interconnected operators. The connections between these operators are maintained via connectors, which are code modules that implement specific streaming protocols. Connectors are limited in number and merely provide a transport layer with no understanding of the structure of the data they are transmitting. Examples of connector types are TCP or IP.
There are two incarnations of paths: a logical path is a sequence of named operators without specific machines to run those operators; a physical path is created when the operators in a logical path are augmented with specific machines known to be running those operators. Instantiation is the process of building a physical path from a logical path. Operators in a logical path are known to have compatible input and output types; compatibility is enforced by the automatic path creator (APC), which checks the XML of each of the operators in a path for compatibility. If two adjacent operators in a path aren't compatible, the APC attempts to insert compatible operators; if this isn't possible, the path is not allowed to be instantiated.
Once a physical path has been instantiated, it must be implemented: a separate thread is started on each operator, and a connector stream to the previous and next operator threads are installed in each thread. The first operator in the path, called the source, can now begin writing data down the path with the expectation that the last operator, the sink, will receive that data in a format it understands.
7. Path Instantiation and Implementation
Once a logical path has been built by the APC, two steps are required before data can begin flowing down it: instantiation and implementation. The process of instantiation works as follows: for each operator in the logical path, query the SDS for all the running instances of that operator. Randomly choose one of the instances of that operator as the host for this operator on this path. Random selection is done to decrease load on any one particular operator: when many paths are running concurrently, random selection produces a uniform distribution of paths across all instances of operators.
The path instantiator includes one further optimization: operators may choose to register themselves as uploadable by including a URL where their code can be found. Non-uploadable operators are called fixed. Why would an operator want to be fixed? There are two primary reasons: either, the host machine is an active proxy connected to some resource which can't be moved to another machine, or the operator requires some specific feature of the machine it is running on (such as a very fast processor or the high-availability of a MultiSpace) which is not available on an arbitrary machine. The source and sink operators are also considered fixed.
Uploadable operators are uploaded and instantiated on the machine running the source operaor, to decrease the amount of network traffic through a path, assuming most operators are uploadable. Notice that it would be preferable to upload operators onto the same machine as an adjacent fixed operator, since that would guarantee a reduction in network traffic. Unfortunately, implementing this is hard since machines with fixed operators are likely dedicated to some specific task and may not allow new code to be uploaded -- exploring ways of working around this is important future work.
In the current implementation, uploadable operators must already be running somewhere. A preferable solution would be to allow operator authors to register their code with some central service. Code could then be uploaded without first contacting the author's machine. Providing this functionality requires a central database of code or code locations; such a database has yet to be implemented, but would also be a valuable future addition.
A path can be completely or partially re-instantiated without first contacting the APC; this is important because it enables rapid path recovery (see section 8) and also makes load-balancing feasible by re-instantiating overloaded operators (a feature which is currently unimplemented.)
Once a path has been instantiated, it needs to be implemented before it can run. The first step in implementation is to assign a unique path-id to the path. This path-id is used to refer to the path on each running operator, as described in section 5 above. Operators are then implemented in a pairwise fashion, as follows:
All paths must have at least two operators: a source and a sink. Notice that the first operator does not have an input, and the last operator does not have an output. The control-path is an RMI-stub to an operator to notify in the event of a failure; typically it is the source operator (see section 8 on failure recovery). The control-path will also be important in future work to allow operators to relay flow-control information.
A thread for an implemented operator will usually loop, reading some data, transforming it appropriately, and then outputting some data. The operator thread is started only after its input and output connectors are in place, so it isn't required manage any connection specific information.
Both the instantiator and implementer are implemented as static classes. Because they don't store state and aren't particularly CPU intensive, this is to implementing them as standalone iSpace services which have a high overhead due to RMI calls.
Appendix A: Sample Operator and Path Building Script
One of the features of the Paths package described in this document is that it makes it very easy for operator authors and users who wish to make use of Paths services. Imagine we want to write a simple operator to convert Pig Latin to English. Two steps are required: first, an XML description of the operator must be written. In this case, our operator has one input, which is Pig-Latin Text, and one output, which is English Text. Although this description appears long, these descriptions are easy to derive from the DTD presented in the APC section.
<?xml
version="1.0"?>
Once the XML description is written, the code for the operator is very simple -- the input and output streams are fetched and written to. The operator does not have to open or close its streams:
public class PigLatinToEnglish extends Operator
implements OperatorIF, SDSOperatorIF {
- public PigLatinToEnglish() throws RemoteException {
- super();
- }
- public void run() {
- BufferedReader reader;
BufferedWriter writer;
String s, outs;- //fetch streams for our instance of the operator
- Object pathId = getPathID(Thread.currentThread());
reader = new BufferedReader(new InputStreamReader(getReader(pathId)));
writer = new BufferedWriter(new OutputStreamWriter(getWriter(pathId)));- while (true) { //loop forever -- this thread will be terminated when no more data to send
- try {
- s = reader.readLine();
outs = new String();- //convert s from pig latin to english
- ...
- } catch (IOException e) {
- }
- }
- }
- }
That's all that needs to be done. Note that if a stream failure occurs, the input and output stream wrappers will kill the operator automatically and forward the error up the control path. Also notice that buffered readers and writers function properly with this error recovery facility.
Building a path is also straightforward. The user simply specifies the source and destination of the path, asks the APC to connect it, and then runs the instantiator and the implementer.
The path is now fully implemented: instances of each of the
threads in the path are running, and data is flowing.