Who am I? Software Architecture




Once upon a time there was a mad scientist who created LIFE in a laboratory late one night (crazy laughter).



So how did I get interested in Software Architecture? Back at Waikato University I'd become fixated on Machine Learning, specifically autonomous undirected learning, so for my MSc research/thesis (which eventually stretched over a couple of years and was worth half the course weight) I decided to "solve" it. The idea was to have a simulated child robot learner in a simulated block-stacking world: a simulated arm, blocks, and naive physics. I digested every paper and book I could find on machine learning, logic programming, philosophy of science, and cognitive child psychology. The proposal was eventually to write and experiment with a program for "paradigm-directed computer learning". Inspired by Kuhn's paradigm shifts, I wanted to explore what would happen to a learner whose actions and thoughts are primarily directed by its current paradigm of how it believes the world works: it can develop a theory (concepts, causal laws) consistent with the current paradigm, or decide to reject the paradigm, throw out the current theory, and start over again with a new paradigm. As a result the program needed a number of main functions, including:

1 Simulation of the robot arm, blocks, and naive block-stacking physics. The world had a number of differently sized and shaped blocks, the arm could simply pick one up and put it somewhere (e.g. on top of another one), and the naive physics determined whether it would stay put or whether the whole pile would collapse (for most children this is the fun part). The blocks also had other features such as colour, texture, patterns, etc. None of these features "mattered". What mattered was the size, shape, and location of blocks relative to each other (i.e. a naive centre of gravity). There was no wind or annoying little sisters.

2 Related to (1), there was also a (simplified) graphical representation of the robot arm and the current state of the world. You could watch the arm pick up a block, move it somewhere, and drop it, and if the resulting state wasn't stable there would be a "crash" and the blocks in the pile would end up randomly scattered around the "floor".

3 The "brain" had several modules including:

3a Theory formation (inductive concept formation): based on remembering previous actions and their results, develop a new theory that is consistent with the evidence AND consistent with the current paradigm. If this is not possible, throw away the current paradigm and decide on another one (3b). Note that a theory was a set of causal laws which predicted whether a block would stay on after stacking or not. It involved time (using Allen's temporal logic), actions (what the arm did), and the properties and relationships of the blocks in the current pile (ignoring other blocks from memory).

3b Paradigm generator: given (some) memory of what has already happened (paradigms, actions and results, theories, but not complete knowledge, as the "paradigm" colours or limits what it thought was interesting and therefore what it remembered), choose another paradigm to try. Note: a paradigm was highly simplified, just a set of "properties" and "relationships" that were currently deemed interesting. There were about 10 properties (e.g. colour, size, shape, etc.) and relationships for horizontal and vertical block positions.

3c Choose the next action: based on the current paradigm, the current theory, and memory of the actions and results performed so far (but only partial information, see 3b), choose an action that maximises the chance of learning something "interesting", either by attempted refutation (best) or confirmation of the theory. The idea that refutation of theories is preferable came from Popper.

3d I think there was another module which kept track of what was happening and provided a stream-of-consciousness explanation (in a text bubble on the screen) so I knew what was going on (this may have been part of the other modules). It could also express "emotions": it was "happy" when an experiment had gone well (i.e. the prediction was confirmed and the blocks either stayed put or crashed, depending on the prediction), and "annoyed" (actually more "puzzled") when this didn't happen.

Now, if I'd implemented all this as a single monolithic Prolog program it would probably have "worked", but it would have been hard to write and modify. So I wrote it as a series of sub-modules loosely coupled around the above functional divisions. In practice there was communication required between the modules, and data which needed to be shared as well. In the long run I ended up with the largest known Prolog program at the time (10kLOC?), and a working program (it took 3 days of VAX 11/780 time to run through a couple of paradigm shifts and a few dozen actions, while everyone else was away on holidays).
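To give a flavour of how those modules fitted together, here's a minimal, hypothetical sketch in Python. The original was Prolog and far richer; the toy physics, feature names, and the random paradigm/action choices below are mine, and the real system preferred refuting experiments (3c) rather than random ones.

```python
"""Toy, hypothetical reconstruction of the paradigm-directed learning loop.
The simplified 'physics', features, and names are illustrative only."""
import random

FEATURES = ["colour", "texture", "size", "shape"]

def physics(top, bottom):
    """Hidden naive physics: a pile is stable only if the top block is no bigger."""
    return "stable" if top["size"] <= bottom["size"] else "crash"

def random_block():
    return {"colour": random.choice(["red", "blue"]),
            "texture": random.choice(["smooth", "rough"]),
            "size": random.choice([1, 2, 3]),
            "shape": random.choice(["cube", "pyramid"])}

def remember(memory, top, bottom, paradigm):
    """Record an experiment, keeping only the features the paradigm finds interesting."""
    memory.append(({f: top[f] for f in paradigm},
                   {f: bottom[f] for f in paradigm},
                   physics(top, bottom)))

def form_theory(memory, paradigm):
    """3a: look for a causal law ('top <= bottom on feature f means stable')
    consistent with every remembered observation; None means the paradigm failed."""
    for f in sorted(paradigm):
        if all(not isinstance(t[f], str) and (t[f] <= b[f]) == (o == "stable")
               for t, b, o in memory):
            return f
    return None

def learn(steps=50):
    """Main loop: experiment, induce a theory, shift paradigm (3b) when stuck.
    The real system chose experiments to try to refute its theory (3c, Popper);
    here they are just random."""
    paradigm, memory, theory = {"colour", "texture"}, [], None
    for _ in range(steps):
        top, bottom = random_block(), random_block()
        remember(memory, top, bottom, paradigm)
        theory = form_theory(memory, paradigm)
        if theory is None:            # Kuhnian shift: new interests, memory discarded
            paradigm = set(random.sample(FEATURES, k=2))
            memory = []
    return paradigm, theory

print(learn())   # e.g. ({'size', 'shape'}, 'size') once a useful paradigm is found
```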

I found the MSc thesis the other day and scanned a few pages in. Here's the architecture diagram (the rest of the pages are at the bottom).



So this was my first experience of real software architecture; up until then I'd written programs that were large but monolithic, or small but "distributed" (e.g. using VAX mailboxes).

A simple way of thinking about software architecture is therefore how you "chop up a long program": which bits belong together more than other bits (cohesion), and how the chunks interact (coupling).

Another famous approach is that architecture is whatever is hard (or expensive) to change. This can include high-level design, language, OS, some features (which may be hard to retrofit, e.g. undo), and "structure" or topology (e.g. client/server, n-tier, p2p, etc.). I often think of something as having architecture if you can do an architecture tradeoff analysis of the "features" and/or the different alternatives or choices made - i.e. this is better than that, for this reason, for this goal; or if you change this feature then this is the impact; etc. If you can't do this it's probably not "architectural" enough at the granularity of examination (or just poorly architected!?)

I received an MSc (1st class honours) for this work :-)
Publications from this research were:

Brebner, P., “Towards a computational model of paradigms”, 32nd Annual Conference of the Australasian Association of Philosophy, Australasian Journal of Philosophy, Vol. 63, No. 3, September 1985.

Brebner, P., “Autonomous Paradigm-directed discovery of Naïve Scientific Theories by Computer”, First Pan-Pacific Computer Conference, Australian Computer Society, 10-13 September 1985, Melbourne, Australia, pp. 953-973.


I recently noticed this work by Google using a similar environment; it looks like there's nothing new under the sun, or in Machine Learning?

https://www.newscientist.com/article/2112455-google-deepminds-ai-learns-to-play-with-physical-objects/

https://arxiv.org/pdf/1611.01843v1.pdf

PS I corresponded briefly with Misha from Google after reading about his project in New Scientist a few months ago. I emailed him about my work from 32 years prior, and wished him: Good luck with your work, I hope it "grows up" faster than mine :-)

He replied: "It sounds like we are still struggling with some of the same issues you were looking at more than 30 years ago. How can machines form inductive theories from data? And very importantly, how can they use these theories to guide action for gathering information? I guess they turned out to be very hard problems :)"

-- Misha

P2S
In the context of caching for clouds I wondered if anyone has thought of applying complexity metrics to measure the pros/cons of various cloud patterns, service compositions, etc.?
E.g. cyclomatic complexity was widely used for standalone code bases and was extremely useful for deciding if and how to refactor/re-engineer the architecture of monolithic applications. Perhaps it's also of use for cloud, microservices, etc.?
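As a rough illustration of what that could even mean (my own toy example, not from the book or papers below): McCabe's cyclomatic complexity is M = E - N + 2P for a graph with E edges, N nodes, and P connected components, and the same formula can be applied naively to a microservice call graph, treating services as nodes and calls as edges.

```python
# Naive application of McCabe's cyclomatic complexity, M = E - N + 2P,
# to a made-up microservice call graph: services are nodes, calls are edges.

def count_connected_components(nodes, call_graph):
    """Count components, treating the call graph as undirected."""
    neighbours = {n: set() for n in nodes}
    for caller, callees in call_graph.items():
        for callee in callees:
            neighbours[caller].add(callee)
            neighbours[callee].add(caller)
    seen, components = set(), 0
    for n in nodes:
        if n not in seen:
            components += 1
            stack = [n]
            while stack:
                m = stack.pop()
                if m not in seen:
                    seen.add(m)
                    stack.extend(neighbours[m] - seen)
    return components

def cyclomatic_complexity(call_graph):
    nodes = set(call_graph) | {c for callees in call_graph.values() for c in callees}
    edges = sum(len(callees) for callees in call_graph.values())
    components = count_connected_components(nodes, call_graph)
    return edges - len(nodes) + 2 * components

# Hypothetical composition: an API gateway fanning out to a few services.
calls = {
    "gateway": ["orders", "users"],
    "orders": ["inventory", "payments"],
    "payments": ["users"],
}
print(cyclomatic_complexity(calls))   # E=5, N=5, P=1 -> M = 5 - 5 + 2 = 2
```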

E.g. the book Complex Systems and Clouds mentions cyclomatic complexity (but doesn't really pursue the idea).

And this paper (now 10 years old) defines some metrics for the complexity of distributed systems (which need a different metric from sequential monolithic code complexity). Has anyone applied or extended this work to cloud computing?

https://www.researchgate.net/publication/220657612_What_is_the_complexity_of_a_distributed_computing_system
https://pdfs.semanticscholar.org/38df/189911311e90ffb651506f0553d0b2f9f8a7.pdf

And this paper (also 10 years old) measures the cyclomatic complexity of some distributed systems patterns.

P3S
MSc thesis sample pages



And a cartoon version of the robot doing experiments, thinking aloud, changing theories and paradigms, etc. (not actual screenshots). Drawn on the first Apple Macintosh computer the department had (it had a mouse! You had to book days in advance for 1/2 hour slots; the IBM PC didn't have any booking sheet, ha ha). Notice the change in the robot's "emotions" as it is surprised by things.















1990s

My next major foray into Software Architecture was working for a UNIX startup company in Sydney in the early 1990s. At one level this was all architectural, as UNIX and UNIX applications and systems programs have architectural constraints and best practices for development etc. This was probably my first in-depth experience of an architectural ecosystem that was comprehensive and well defended against all comers. UNIX was even derived from a Philosophy almost as old as Plato!
If you were modifying the UNIX kernel you jolly well better have understood this stuff or someone would come and find you (and give you a good talking to). Was this software architecture? Well yes, but specifically Operating Systems software architecture.

My contributions to non-OS software architecture were actually derived from my ML/AI work around modelling and generation of concepts. There were 2 problems in particular that were amenable to some of these methods. One was model-driven test case generation: based on run-time analysis of the specifications and the hosting environment, it generated as many tests as necessary (and as could actually be run) in a fixed period of time, giving broad and deep coverage.
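Something in the spirit of that generator, as a hedged sketch only (the spec format, parameter names, and run_test stub below are invented, not the original system): enumerate test cases from a simple parameter specification, shuffle them so the early cases are spread broadly across the space, and run as many as fit in a fixed time budget.

```python
"""Illustrative sketch of time-budgeted, specification-driven test generation.
The 'specification', parameters, and run_test stand-in are invented."""
import itertools
import random
import time

# A toy specification: each parameter with its allowed values.
SPEC = {
    "filesystem": ["ufs", "nfs"],
    "block_size": [512, 4096, 65536],
    "concurrency": [1, 8, 64],
}

def generate_cases(spec):
    """All combinations, shuffled so early cases sample the whole space
    (broad coverage first) rather than walking one parameter at a time."""
    names = sorted(spec)
    cases = [dict(zip(names, values))
             for values in itertools.product(*(spec[n] for n in names))]
    random.shuffle(cases)
    return cases

def run_test(case):
    """Stand-in for actually exercising the system under test."""
    time.sleep(0.01)
    return {"case": case, "passed": True}

def run_within_budget(spec, budget_seconds):
    """Run as many generated cases as fit in the time budget."""
    deadline = time.monotonic() + budget_seconds
    results = []
    for case in generate_cases(spec):
        if time.monotonic() >= deadline:
            break
        results.append(run_test(case))
    return results

print(len(run_within_budget(SPEC, budget_seconds=0.1)), "cases run in budget")
```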

The 2nd was a protocol generator (actually an interpreter) for distributed systems. This was based on specifying the protocol in a logic-programming-like language, and then generating specific implementations for different target environments and "host" protocols. How were these architectural? Well, simply by solving a complex problem in a way which didn't require manual rework when there were changes in things that were outside your control and likely to change a lot (e.g. environments, protocols, etc.). I recall the protocol generator worked really well, as there was a "bug" in the protocol we had been given to support for the target environment, which we didn't find out about until the day before turn-on. We changed the spec and regenerated the code in a matter of minutes. The poor customer engineer on the other hand (who had given us the buggy protocol and implemented it on his side with the bug) took hours/days to make the changes and delayed the turn-on.
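The general idea, very loosely sketched (the state/event table and names below are invented; the real system used a much richer logic-programming-like specification): keep the protocol in a declarative table and drive it with a generic interpreter, so a change to the protocol is a change to the table rather than to hand-written code.

```python
"""Illustrative table-driven protocol interpreter (details invented)."""

# Declarative protocol specification: (state, event) -> (action, next_state).
PROTOCOL = {
    ("idle", "connect_request"):  ("send_connect_ack", "connected"),
    ("connected", "data"):        ("deliver_data", "connected"),
    ("connected", "disconnect"):  ("send_disconnect_ack", "idle"),
}

def interpret(spec, events, on_action=print):
    """Drive the protocol from the declarative spec; unknown events are errors."""
    state = "idle"
    for event in events:
        try:
            action, state = spec[(state, event)]
        except KeyError:
            raise ValueError(f"protocol error: event {event!r} in state {state!r}")
        on_action(action)
    return state

# A change to the target environment's (possibly buggy) protocol is a spec edit,
# followed by 'regeneration' (here, simply re-running the interpreter).
final = interpret(PROTOCOL, ["connect_request", "data", "disconnect"])
print("final state:", final)
```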

I worked at ABC TR&D for a year in 1995 on their D-Cart audio system. I invented and implemented a multimedia file system based around the use of Allen's Temporal Logic. This was a fundamental architectural choice and made it easier to relate different multimedia types across time, including finding related material, etc.
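For context, Allen's interval relations classify how two time intervals can relate (before, meets, overlaps, starts, during, finishes, equals, and their inverses). A minimal sketch of how that supports "find related material across time" queries might look like this (the clip data and function names are mine, not D-Cart's):

```python
"""Sketch of Allen's interval relations applied to media clips (illustrative only)."""
from dataclasses import dataclass

@dataclass
class Clip:
    name: str
    start: float   # seconds on a shared timeline
    end: float

def allen_relation(a, b):
    """Return the Allen relation of interval a with respect to interval b."""
    if a.end < b.start:   return "before"
    if a.end == b.start:  return "meets"
    if b.end < a.start:   return "after"
    if b.end == a.start:  return "met-by"
    if a.start == b.start and a.end == b.end: return "equals"
    if a.start == b.start:  return "starts" if a.end < b.end else "started-by"
    if a.end == b.end:      return "finishes" if a.start > b.start else "finished-by"
    if b.start < a.start and a.end < b.end:   return "during"
    if a.start < b.start and b.end < a.end:   return "contains"
    return "overlaps" if a.start < b.start else "overlapped-by"

def related(clip, library):
    """Find clips that share time with 'clip' (any relation except before/after)."""
    return [(other.name, allen_relation(clip, other))
            for other in library if other is not clip
            and allen_relation(clip, other) not in ("before", "after")]

news   = Clip("news intro",   0.0, 30.0)
music  = Clip("theme music",  0.0, 10.0)
report = Clip("field report", 25.0, 90.0)
print(related(news, [news, music, report]))
# [('theme music', 'started-by'), ('field report', 'overlaps')]
```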

Product Architecture Evolution 2007-2017

In this period I was involved in R&D and productization of a technology for predictive analytics for performance engineering modelling. This involved making and changing the product architecture a number of times as it evolved from a prototype to a usable, robust product with different deployment options. It started out as a pure Java custom application including a meta-model, a simulation engine, and other components including a GUI and XML input/output. We wrote the meta-model architecture/framework and simulation engine ourselves, but used open source tools for the GUI (a graph/network visualization library), graphs (an open source graph API), and XML persistence. This version was easy to understand, modify, and maintain.

The next version was a major architectural and technology change to a SaaS platform. This required moving the meta-model to an SQL database (probably the wrong choice, as it made it too hard to change the meta-model), a browser-based GUI in JavaScript with a choice of JavaScript framework, writing the GUI components (model visualisation and graphing) from scratch without the benefit of libraries (also a long-term problem), the use of multiple languages (JavaScript, Java, SQL) and frameworks, and the use of open source technologies (e.g. web server, Cassandra, Apache Spark, security, account management, etc.). This made the product bigger (50kLOC now), harder to understand, modify, and maintain, and harder to configure, deploy, and maintain on customer machines. The plan was for everyone to use it as a SaaS, but in practice many customers would not let their data outside their firewall, so we had to deploy a copy inside on their own systems.

The 3rd change came when we found the JavaScript framework was too slow/buggy and had to change to another one. This caused potential problems with maintaining and running 2 versions with different frameworks (as it takes time to change everything over to another framework, more than 2 weeks!). Luckily we found we could run both together.

The 4th change was to deploy it in a Docker container to enable simple deployment to different platforms, including AWS.

The 5th change was based on all the above experience (so probably not possible earlier): I wrote a new version from scratch, without a model or simulation engine, in a week and in only a few hundred LOC. It does 90% of what the 50kLOC version does and integrates trivially with Dynatrace and Google Sankey diagrams. It's ultra fast, so there's no need to store results, just rerun. It uses a transformation (data-to-data) approach to build "models" from the current system and then run a past workload against the model to predict any performance/scalability problems. Manual model building, changing, and even visualization were too cumbersome and unnecessary for many DevOps-type use cases, and the model/simulation is also unnecessary for them. This is still a prototype, so it may evolve and grow over the next 10 years...
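To give a rough feel for the approach (a toy sketch with invented services and numbers, not the actual tool): extract a simple per-service model from past monitoring data, then replay a past, possibly scaled, workload against it and flag services that would saturate.

```python
"""Toy data-to-data 'model from monitoring data' sketch (names and numbers invented)."""

# Per-service model extracted from past monitoring data:
# mean service time (seconds per request) and number of parallel workers.
MODEL = {
    "frontend":  {"service_time": 0.005, "workers": 4},
    "orders":    {"service_time": 0.020, "workers": 8},
    "database":  {"service_time": 0.002, "workers": 2},
}

def predict(model, workload_rps, scale=1.0):
    """Replay a past workload (requests/second per service), optionally scaled,
    reporting utilisation and a crude response-time estimate; flag saturation."""
    report = {}
    for service, params in model.items():
        rate = workload_rps.get(service, 0.0) * scale
        utilisation = rate * params["service_time"] / params["workers"]
        if utilisation >= 1.0:
            report[service] = {"utilisation": round(utilisation, 2),
                               "status": "SATURATED"}
        else:
            # Crude M/M/1-style approximation per worker: R = S / (1 - U).
            response = params["service_time"] / (1.0 - utilisation)
            report[service] = {"utilisation": round(utilisation, 2),
                               "response_time": round(response, 4)}
    return report

# Replay last month's workload at twice the volume and look for trouble spots.
past_workload = {"frontend": 400, "orders": 150, "database": 600}
print(predict(MODEL, past_workload, scale=2.0))
```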



