A Development Environment For Robotic Software
Fry, cfry@media.mit.edu, February 18, 2017

As consumers demand "mass customization", the hardware of our robots is meeting the challenge by becoming more capable: more degrees of freedom, higher accuracy, greater speed, and additional sensors. To take advantage of this hardware, we need more sophisticated software. This is as it should be; it is easier to change a line of code than a transmission line for electricity or light.

Software
Unfortunately the cost of software development can be quite high, especially when that cost is not amortized over the manufacture of many copies of the same thing. The operators of these robots may be factory workers, mechanical engineers, or even home users doing personal manufacturing. Without the luxury of professional programmers, we must build "end user programming environments" that facilitate customizing parts, fixing build-environment-specific bugs, and ultimately designing whole new parts.

3D printers are perhaps the most common "manufacturing robot" in homes, but they typically use just one material and one process for manufacturing. Complex products will need multiple materials and multiple processes (CNC, laser cutting, sintering, pick and place, assembly, etc.).

As the complexity of the build process grows, so too must the description of the job. A conventional "app" with a few buttons can't accommodate the breadth of customization necessary. A user will need to interact at multiple levels, permitting ever-increasing control over complex procedures. Ultimately we need the full power of a general purpose programming language: our build-process description looks less like a list of numbers and more like a full-fledged program.

Robots
The Dexter Development Environment (DDE) was designed to program the rather adept "Dexter" robot, a 5-axis arm designed for table or ceiling mounting. Dexter contains a general-purpose processor that runs Linux as well as an FPGA "supercomputer" for high-speed parallel operations.

As flexible as Dexter is, complex multi-robot builds may involve not just multiple robots, but robots of different kinds. DDE's architecture accommodates multiple kinds of robots that can perform in concert. The simplest robot kind that DDE supports is called "Brain". It can't perform physical operations on its own, but it can direct other robots to do so, just as an orchestra conductor doesn't play an instrument, but directs others that do.
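To make the idea concrete, here is a toy sketch of the Job/robot split. The class and method names are invented for illustration; they are not DDE's actual API:

    // A toy model of the Job/robot split (invented names, not DDE's real API).
    class Brain {
        constructor(name) { this.name = name; }
        // A Brain never moves anything; it only "conducts" by issuing directives.
        perform(instruction) { console.log(this.name + ": " + instruction); }
    }

    class Job {
        constructor(robot, doList) { this.robot = robot; this.doList = doList; }
        start() { for (const inst of this.doList) this.robot.perform(inst); }
    }

    new Job(new Brain("conductor"),
            ["cue the Serial robot", "cue the Human robot"]).start();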

The "Serial" robot is a non-specific robot controller that drives processes through a serial port, such as the USB port in a laptop. It has been used to control an Arduino. We might think of a Serial robot as a sort of general purpose percussionist, who can play a variety of instruments dependent upon the piece to be played (or rather built.)

The "Human" robot is perhaps the most controversial yet conceptually interesting. Complex jobs may not be able to be fully automated. The Human robot has an instruction set that includes operations only a person can perform. These are either a textual description of a task, or some choice that the human operator needs to make. The important idea is that an instruction for a human can be stuck on a do-list and treated just like any other kind of instruction. We can thus synchronize human tasks and automated tasks in one coherent environment.

The Human robot is the singer in our orchestra. It does not need an instrument to perform its part. We can even use DDE jobs to coordinate multiple humans, a chorus if you will!

Unlike a typical musical performance, we don't require the person to memorize their part. Text, menus, other graphical widgets, and even speech can be used to make life easy for the singer. Our conductor tells the singer when to start singing, though the singer must tell the conductor when she is done. This lets the person take as much time as the task requires, giving her maximum flexibility to perform her best.

An example of "lyrics" that DDE presents to the "singer".

Self Teaching
If this complex software environment is to be effectively used, it must be understood by its users. DDE utilizes a variety of media for explaining itself to users. Thus DDE performs the role of "music teacher", though not so much for training performers as training composers, who write the score (do-list) for each instrument (robot). Because a do-list is a carefully ordered sequence of instructions that must be synchronized with other instruments, our instructions bear more than a passing resemblance to notes in an actual score.

Above we see the 4 panes of DDE. The upper left pane is the code editor. It can contain any JavaScript. The content in this screen shot was inserted from an example on the Jobs menu (shown expanded). A job feeds the instructions on its do-list to a robot. This insertion comes not just with the code, but with comments explaining the code's functionality (in brown text). This particular example contains two jobs that coordinate by "synchronizing" at each of their "sync_point" instructions.
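The essence of sync_point is that each job pauses at the named point until every participating job has arrived. Here is a toy model for two jobs, using invented names rather than DDE's real classes:

    // A toy model of sync_point for two jobs: the first to arrive waits
    // until its partner reaches the same named point.
    const waiting = {};

    function syncPoint(name) {
        return new Promise(function (resolve) {
            if (waiting[name]) { waiting[name](); delete waiting[name]; resolve(); }
            else waiting[name] = resolve;
        });
    }

    async function jobA() {
        console.log("A: prepare part");
        await syncPoint("handoff");   // both jobs proceed only after both arrive
        console.log("A: release part");
    }

    async function jobB() {
        console.log("B: move gripper into place");
        await syncPoint("handoff");
        console.log("B: grasp part");
    }

    jobA(); jobB();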

Every character in the editor has help available by clicking on it. Here we clicked on "sync_point". Concise help appears in the Output pane in the lower left. The blue text in this help is a link to extended documentation from the reference manual, shown in the Documentation pane in the upper right. The lower right is the Simulation pane, which shows a graphical simulation of the robot under control, or mimics a real robot.

Orchestras usually have scores with every note written out ahead of the performance. However, our real-world builds need to be more flexible than that. Robot sensors can detect anomalies that may need to be addressed during the build, just as professional musicians can cover up for each other's mistakes.

Since the do-list can contain arbitrary JavaScript, it can actually generate new instructions on the fly, much as a jazz improviser composes on the spot. One of DDE's instruction types is literally a JavaScript "generator". It can produce a stream of instructions whose length needn't even be set before the generator starts producing them, perhaps like some stage-hogging lead guitarist.
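Here is a sketch of the idea, using a real JavaScript generator function whose output length is decided at run time by a sensor reading (faked here so the sketch runs standalone):

    // A generator-backed instruction stream: its length is decided by a
    // sensor, not before the job starts.
    function* polishUntilSmooth(sensor) {
        while (sensor.roughness() > 0.1) {
            yield "polish pass";
        }
        yield "measure and log finish";
    }

    // Fake sensor whose reading improves with each pass.
    let r = 0.5;
    const sensor = { roughness: function () { return (r -= 0.15); } };

    for (const instruction of polishUntilSmooth(sensor)) {
        console.log(instruction);
    }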

A Natural Language Application
We can take advantage of DDE's window-system programming, speech generation and recognition, natural-language parsing, and AI reasoning to make easy-to-program, as well as easy-to-use, interfaces to sophisticated applications. In this example, we use Google's speech I/O and the MIT InfoLab Group's English parser (named START) to build an application that lets a user create a knowledge base and ask questions about it, all through a speech interface. Note that this application does not control the Dexter robot, but it does show some of the generality that DDE employs to describe complex tasks.

The high level interface is a spoken dialog like this:

Human: A robot has a hand.
DevEnv: OK.

Human: If a robot has a hand, a robot is useful.
DevEnv: Got it.

Human: Does a robot have a hand?
DevEnv: Yes, a robot has a hand.

Human: Why is a robot useful?
DevEnv: Robot is useful because robot have hand.

If a user presses Click to talk before saying each of the above example sentences, they will create a knowledge base and get speech feedback after each sentence, ending with "Robot is useful because robot have hand."
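For the curious, the "Click to talk" button can be sketched with Chrome's webkitSpeechRecognition, the Google speech input mentioned above. The sendToParser function is a stand-in for whatever hands the transcript to START, and the page is assumed to contain a button with id "click-to-talk":

    // "Click to talk" via Chrome's webkitSpeechRecognition.
    function sendToParser(text) { console.log("heard: " + text); }

    const recognizer = new webkitSpeechRecognition();
    recognizer.onresult = function (event) {
        sendToParser(event.results[0][0].transcript);
    };

    document.getElementById("click-to-talk").onclick = function () {
        recognizer.start();
    };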

Levels of Programming
Our base level of programming for this application is JavaScript. We layer on DDE utilities, making access to I/O and parsing easier at the next level up. Finally, we use a knowledge base and reasoning for our natural language interface. If a user needs functionality not available at a high level, they can drop down a level, remaining in the same development environment, to gain increased breadth at the expense of using a lower-level language.
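As a sketch of the layering in a browser context: speak() plays the role of a mid-level utility over the real Web Speech API, while the commented-out top-level call is invented only to show its shape. A user needing finer control drops down to the raw API:

    // Mid-level utility hiding the raw Web Speech API.
    function speak(text) {
        speechSynthesis.speak(new SpeechSynthesisUtterance(text));
    }

    speak("Got it.");                   // the utility level: one obvious call

    // kb.tell("A robot has a hand.");  // hypothetical top level: English as code

    // Dropping down a level for finer control than speak() exposes:
    const utterance = new SpeechSynthesisUtterance("Got it.");
    utterance.rate = 0.9;               // a slower voice
    speechSynthesis.speak(utterance);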

It is important to understand that the above English sentences are, literally, code. They create a program with a definite behavior that we can access by asking questions.

Debugging
The majority of a programmer's time is not spent typing in code; it is spent debugging that code. Analogously, the majority of our natural-language coder's time will not be spent speaking the code; it will be spent debugging it.

A winning strategy for developing working code quickly is to get feedback from our development environment incrementally as we build our application. As we construct our "Robot Knowledge Base Application", the development environment gives us feedback after each sentence. For the two sentences that add knowledge to the knowledge base, "A robot has a hand." and "If a robot has a hand, a robot is useful.", our development environment responds with "OK" and "Got it." (Identical acknowledgement in English is boring and unnatural!) This informs our natural language coder that their utterances are syntactically correct.

If we consider that the goal of our app is to answer the final question, the intermediate questions are probing the semantic validity of the knowledge base being built. They take the place of "print statements" in traditional programming. However, as you can see, they are much higher level than print statements: they don't need to be embedded inside some complex algorithm, and they produce output in a very understandable way. For instance, "Does a robot have a hand?" is answered with "Yes, a robot has a hand."
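The mechanism can be sketched as a read-only probe of the knowledge base, with facts simplified to normalized strings (DDE's actual representation is surely richer):

    // A question as a read-only probe of the knowledge base.
    const facts = new Set(["a robot has a hand"]);

    function ask(fact) {
        return facts.has(fact) ? "Yes, " + fact + "."
                               : "I don't know whether " + fact + ".";
    }

    console.log(ask("a robot has a hand"));  // "Yes, a robot has a hand."
    console.log(ask("a robot is useful"));   // "I don't know whether a robot is useful."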

Knowledge Base Debugging
Applications that rely on knowledge bases have a special kind of bug: bugs in the knowledge base itself. We therefore need to be especially careful to inform the natural language coder about what's in that (potentially huge) knowledge base, but only with what's relevant at the moment.

Here's a dialog our app supports:

Human: A robot has a hand.
DevEnv: OK.

Human: A robot has a hand.
DevEnv: I already know that.

Our application responds to the first sentence with "OK" but to the second with "I already know that." This informs our natural language coder about redundancy, much as a fellow human would in a conversation. Furthermore, our app does not insert the same knowledge into the knowledge base twice.
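The duplicate check behind this dialog can be sketched the same way, again simplifying facts to normalized strings:

    // Asserting a known fact gets a different acknowledgement and is not
    // stored a second time.
    const facts = new Set();

    function assertFact(sentence) {
        const key = sentence.trim().toLowerCase();
        if (facts.has(key)) return "I already know that.";
        facts.add(key);
        return "OK.";
    }

    console.log(assertFact("A robot has a hand.")); // "OK."
    console.log(assertFact("A robot has a hand.")); // "I already know that."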

Here's another example of a dialog:

Human: Why is a robot useful?
DevEnv: I think it's never the case that robot is useful.

Human: If a robot has a hand, a robot is useful.
DevEnv: OK.

Human: Why is a robot useful?
DevEnv: Sorry, robot is useful, is only true under certain conditions.

Human: A robot has a hand.
DevEnv: Got it.

Human: Why is a robot useful?
DevEnv: Robot is useful because robot have hand.

Notice that there are three different responses to the Human's question "Why is a robot useful?". Each informs the user about what is in the knowledge base at the time of the asking, giving clues about one of the most common bugs: missing knowledge.
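The three-way distinction behind those answers can be sketched with one-premise rules stored as simple if/then pairs (a deliberate simplification of real knowledge-base reasoning):

    // Which "why" answer you get depends on what the knowledge base holds.
    const kb = { facts: new Set(), rules: [] };

    function why(goal) {
        const rule = kb.rules.find(function (r) { return r.then === goal; });
        if (!rule) return "I think it's never the case that " + goal + ".";
        if (!kb.facts.has(rule.if)) {
            return "Sorry, " + goal + " is only true under certain conditions.";
        }
        return goal + " because " + rule.if + ".";
    }

    console.log(why("robot is useful"));  // no rule yet
    kb.rules.push({ if: "robot has a hand", then: "robot is useful" });
    console.log(why("robot is useful"));  // rule present, premise missing
    kb.facts.add("robot has a hand");
    console.log(why("robot is useful"));  // full explanation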

Conclusion
The Dexter Development Environment lets a user "print" a part from a design contained in a Job's do-list. But this rather passive interaction is analogous to being an audience member at a concert. Via end-user programming techniques, we enable the user to orchestrate the process, ultimately giving them control of even the low-level instructions in the process. With the "Human" robot, we can synchronize what machines do best with what people do best by providing appropriately timed instructions, similar to real-time lyric presentation in a karaoke bar. The complexity of composing build scripts can be mitigated by DDE's self-teaching techniques and by the AI of natural language processing plus knowledge base management, which we've only just begun (apologies to the Carpenters).