Wednesday, November 19, 2008

(Software) Agents and the Semantic Web

I thought about why the concept of the software agent has not taken off, why people and companies do not use agents as widely as they use other technologies. Two reasons came to mind.

First, there is the security issue. How does one know that a software agent has not been tampered with, that the information it carries is secure enough, or that the other agents it meets are "good" agents rather than "bad" ones? Since the concept of an agent includes the property of being autonomous, how can we restrict the agent in its actions and decisions without greatly reducing its autonomy?

Second, the job of a (software) agent is to talk to other systems, gather and interpret data, make decisions based on that data, and present the results back to the user in a readable and useful way.

While the first issue might already have been overcome, the second has not. In my opinion, to overcome it, one has to use the Semantic Web. If you add meaning to data, agents can interpret it and make proper decisions without having to ask the user for guidance at every step (and thus losing their effectiveness, along with some of the main properties they are supposed to possess, like being autonomous). I am sure this is not a novel idea, and that people have already thought about it, but for me it makes sense. I envision an "Agent Store", or "Agent Market", where people would go to "rent" or "buy" agents to fulfill their immediate or long-term needs, such as paying all their utilities (power, phone, internet, cable, credit cards, etc.) or scheduling a doctor's appointment.
Wouldn't this be nice?
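
To make the semantic part of this a bit more concrete, here is a tiny sketch (entirely my own; the billing vocabulary, the amounts, and the auto-pay rule are invented) of an agent reading a utility bill published as RDF and deciding on its own whether to pay it:

```python
# A hypothetical sketch: an agent reads a semantically described bill and acts on it.
# Requires rdflib (pip install rdflib); the ex: vocabulary and the data are made up.
from rdflib import Graph, Namespace, RDF

EX = Namespace("http://example.org/billing#")

g = Graph()
g.parse(data="""
    @prefix ex: <http://example.org/billing#> .
    ex:bill42 a ex:Bill ;
        ex:provider  "City Power" ;
        ex:amountDue 78.50 ;
        ex:dueDate   "2008-11-30" .
""", format="turtle")

# Because the data carries meaning, the agent does not need the user at every step.
for bill in g.subjects(RDF.type, EX.Bill):
    amount = float(g.value(bill, EX.amountDue))
    provider = g.value(bill, EX.provider)
    due_date = g.value(bill, EX.dueDate)
    if amount < 200:   # standing instruction from the user: auto-pay small bills
        print(f"Paying {amount} to {provider} before {due_date}")
    else:
        print(f"{provider} bill is {amount} -- asking the user first")
```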

Tuesday, November 18, 2008

Frequently Forgotten Fundamental Facts about Software Engineering

An interesting article by Robert L. Glass, originally published in 2001 in volume 18 of IEEE Software. The author writes about frequently forgotten fundamental facts of software engineering.

Wednesday, November 12, 2008

Self-Healing Hulls

In the November 2008 issue of IEEE Spectrum, there is an article about self-healing hulls: the carbon-fiber composite hull of a yacht can heal itself to a considerable degree after a collision, and the healing process can be improved by passing a small electric current through it. The work is being done by Eva Kirkby, a graduate student at EPFL. The main idea is that the composite is made of carbon fibers and epoxy; the problem is that, on impact, these materials tend to separate internally, causing cracks parallel to the surface of the material. To counter this, the material is infused with hundreds of very small bubbles filled with liquid monomer, plus small particles of a catalyst; when a crack breaks the bubbles, the monomer meets the catalyst and hardens, sealing the crack. To keep the concentration and size of the bubbles to a minimum, Kirkby incorporated into the composite wires of a smart alloy, one that can return to its initial shape after being deformed when heat is applied (by running electricity through it). Great idea!

Dilbert's self-aware comic strip

In my recent posts I kept mentioning biologically-inspired concepts such as self-aware, self-healing etc. Here is a comic view of self-awareness:


I love it!

Monday, November 10, 2008

Do agile teams model or write documentation?

To be honest, I had the misconception that agility and modeling are at opposite poles, and that agile teams write little or no documentation. That all changed once I read an article from Dr. Dobb's Journal. Among the reasons agile teams do up-front modeling is "to answer questions around the scope that they're addressing, the relative cost and schedule, and what their technical strategy is." Another reason is to better grasp and manage the complexity of the system architecture.

Some of the agile modeling and documentation best practices mentioned are to do "some initial requirements and architecture envisioning early in the project, to write executable specifications via a Test-Driven Development (TDD) approach, to single source information whenever possible, to write documentation later in the lifecycle, to promote active stakeholder participation, to implement requirements in priority order, to include modeling in iteration/sprint planning activities, to create models and documents that are just barely good enough for the situation at hand, to model storm the details on a just-in-time (JIT) basis, to sometimes model a bit ahead to explore complex requirements, and to take a multiview approach via multiple models".
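
As a toy illustration of the "executable specifications via TDD" practice (my own example, not from the article): the requirement "orders over $100 get a 10% discount" is written as a test first, and it stays around as documentation that can never go stale, because it either passes or fails.

```python
# A hypothetical executable specification, written before the production code (pytest style).
# The requirement "orders over $100 get a 10% discount" lives in the test itself.

def apply_discount(total):
    """Production code, written only after the test below was in place."""
    return total * 0.9 if total > 100 else total

def test_orders_over_100_get_a_ten_percent_discount():
    assert apply_discount(200) == 180
    assert apply_discount(100) == 100   # boundary case: exactly 100 is not discounted
    assert apply_discount(50) == 50
```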

One of the complaints about agile methodologies is that they cannot be applied to large-scale projects and large development teams; for those kinds of projects, plan-driven, model-based approaches are considered better suited. To achieve that kind of scalability, there is an agile version of Model Driven Development (of which MDA is one example), called Agile Model Driven Development (AMDD). The difference is that instead of creating extensive models, you create agile models. Furthermore, with AMDD you do just a little modeling, followed by a lot of coding.

Tuesday, November 4, 2008

Controlled Chaos

In an article entitled "Controlled Chaos" in the December 2007 issue of IEEE Spectrum, the authors describe a new generation of algorithms based on the thermodynamic concept of entropy, a measure of how disordered a system is. Because malicious code changes the flow of data in the network, it also alters the network's entropy. The new malicious threat, called Storm, uses different ways to get installed on the host machine, mostly through email attachments. How do we protect the networks? The first step is to know how traffic moves around the network. Collecting such data from nodes in the network is possible because routers and servers can be configured to provide information about the traffic in the form of source and destination IPs, source and destination port numbers, the size of the packets transmitted, and the time elapsed between packets. Information about the routers themselves is also collected. This information is used by the proposed algorithms to build a profile of the network's normal behavior. The authors stress that the entire network is monitored, not just a single link in it.
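
To make the entropy idea concrete, here is a small sketch of my own (not the authors' code): it computes the Shannon entropy, H = -sum(p * log2 p), of the destination-IP and source-IP distributions seen in one window of flow records; a profile of "normal" is then just these values tracked over time.

```python
# A rough sketch of the entropy measurement described above (not the authors' algorithm):
# Shannon entropy of the destination-IP and source-IP distributions in one time window.
import math
from collections import Counter

def shannon_entropy(values):
    """H = -sum(p * log2(p)) over the observed distribution, in bits."""
    counts = Counter(values)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Flow records: (src_ip, dst_ip, src_port, dst_port, bytes) -- made-up sample data.
flows = [
    ("10.0.0.5", "192.0.2.10", 51234, 80, 1500),
    ("10.0.0.5", "192.0.2.10", 51235, 80, 900),
    ("10.0.0.7", "192.0.2.44", 51300, 443, 4000),
]

print("dst-IP entropy:", shannon_entropy(f[1] for f in flows))
print("src-IP entropy:", shannon_entropy(f[0] for f in flows))
```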

The principle behind the entropy-based algorithms is that "Malicious network anomalies are created by humans, so they must affect the natural 'randomness' or entropy that normal traffic has when left to its own devices. Detecting these shifts in entropy in turn detects anomalous traffic." Once the network's patterns are established, anything that differs from its normal states can be easily detected. Even if the malicious code does nothing more than download pictures from the internet, the network's fingerprint would look unusual, different from what is expected and from how the network is normally used. The authors make an interesting point, namely that Internet traffic has both uniformity and randomness; a worm will alter both, making the traffic either more random or more structured. In the case of the 2004 Sasser attack, the information entropy associated with the destination IP addresses rose suddenly, indicating an increase in the randomness of traffic destinations due to the scanning initiated by the infected machines as they looked for new victims. At the same time, the entropy associated with the source IP addresses dropped suddenly, indicating a decrease in randomness as the already infected computers initiated a higher than normal number of connections. The conclusion is that the network enters a new internal state, unknown before and hence easily detectable.
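
A naive way to flag the Sasser-style signature described above (my own sketch, with invented numbers; it assumes per-window entropy values like the ones computed earlier): destination-IP entropy jumping well above its learned baseline while source-IP entropy drops well below its own.

```python
# A naive detector for the Sasser-style shift described above (my own sketch, invented margins):
# destination-IP entropy spikes upward while source-IP entropy collapses, relative to a baseline.

def looks_like_scanning(dst_entropy, src_entropy, baseline_dst, baseline_src, margin=1.0):
    """Compare the current window's entropies (in bits) against a learned 'normal' baseline."""
    dst_spike = dst_entropy > baseline_dst + margin      # destinations became more random
    src_collapse = src_entropy < baseline_src - margin   # a few infected sources dominate
    return dst_spike and src_collapse

# Hypothetical numbers: a baseline learned from clean traffic vs. a window during an outbreak.
print(looks_like_scanning(dst_entropy=9.8, src_entropy=3.1,
                          baseline_dst=6.5, baseline_src=5.4))   # -> True
```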

The Storm worm I mentioned at the beginning works, in some respects, like other worms: new code is placed on the computer (because the user clicks on some attachment), which makes the machine join a botnet. However, there are distinct differences between older worms and Storm. One of them is the way it gets the user to click the attachment, such as using a clever subject line or attachment name related to hot topics currently in the news: elections, hurricanes, major storms, and so on. Most importantly, Storm hides its network activity. It first looks at what ports and protocols the user is using. If it finds a P2P program, such as eMule, Kazaa, or BitComet, it will use that program's port and protocol to do its network scanning. Storm will also look at which IP addresses the P2P program has communicated with and will talk to those, instead of new IP addresses that would trigger its detection. Furthermore, Storm will not spread as fast as it can, because it has a dormant mode and a walking mode: it will gather information for a short period, then go quiet. It is very interesting that Storm actually tailors its behavior to the pattern of the network's usage. How do we detect Storm? The worm will still alter the network's entropy. For example, during its active period, the host computer will send many emails, which is unusual for normal use. In addition, the port used is not the standard mail port 25. All of these are hints that something is wrong inside the network.
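
The two hints at the end of that paragraph (a burst of outgoing mail from one host, and mail traffic avoiding port 25) could be checked with something as crude as the sketch below; the thresholds and the port list are entirely my own invention, not from the article.

```python
# A crude check for the hints mentioned above (my own sketch, invented thresholds):
# one host suddenly producing many mail-like connections, with some of them not on port 25.

MAIL_PORTS = {25, 465, 587}   # standard SMTP / submission ports

def suspicious_mail_burst(flows, host, normal_limit=20):
    """flows: iterable of (src_ip, dst_ip, dst_port) tuples seen in one time window."""
    mail_like = [f for f in flows if f[0] == host and f[2] in MAIL_PORTS]
    burst = len(mail_like) > normal_limit            # far more mail than this host normally sends
    avoids_25 = any(f[2] != 25 for f in mail_like)   # mail going out on a non-standard port
    return burst and avoids_25

# Hypothetical window: host 10.0.0.5 opens 50 mail connections on port 587.
flows = [("10.0.0.5", f"198.51.100.{i}", 587) for i in range(50)]
print(suspicious_mail_burst(flows, "10.0.0.5"))      # -> True
```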

A great article! Nothing short of what I have come to expect from IEEE Spectrum.