Wednesday, September 23, 2009

Why is POCO well implemented and designed?

When I searched for a description of POCO, I found the following definition:

"The POCO C++ Libraries are a collection of open source class libraries for developing network-centric, portable applications in C++. POCO stands for POrtable COmponents. The libraries cover functionality such as threads, thread synchronization, file system access, streams, shared libraries and class loading, sockets and network protocols (HTTP, FTP, SMTP, etc.), and include an HTTP server, as well as an XML parser with SAX2 and DOM interfaces and SQL database access. The modular and efficient design and implementation makes the POCO C++ Libraries well suited for embedded development."

Statements such as "The modular and efficient design and implementation" are often used to describe libraries, even badly designed ones.

I decided to verify whether POCO is as well designed and implemented as claimed, so I analysed it with CppDepend.

Here is the result of the analysis,

and here is some general information about POCO:

The first remark is that POCO is well commented.

Implementation

Number of lines of code

Methods with many lines of code are not easy to maintain or understand. Let's search for methods with more than 60 lines.

SELECT METHODS WHERE NbLinesOfCode > 60 ORDER BY NbLinesOfCode DESC


Less than 1% of methods have more than 60 lines.

Cyclomatic complexity

Cyclomatic complexity is a popular procedural software metric equal to the number of decisions that can be taken in a procedure.
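To make the definition concrete, here is a minimal sketch that counts decision points. It is an illustration only: it analyses Python source with the standard `ast` module, not C++, and it uses the common convention of one plus the number of decision points. The set of node types counted and the `classify` example are assumptions made for this sketch; CppDepend's exact counting rules may differ.

```python
import ast

# Node types treated as decision points in this simplified sketch.
_DECISION_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)


def cyclomatic_complexity(source: str) -> int:
    """Approximate cyclomatic complexity: 1 + number of decision points."""
    tree = ast.parse(source)
    complexity = 1  # straight-line code has exactly one path
    for node in ast.walk(tree):
        if isinstance(node, _DECISION_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # "a and b and c" adds one extra branch per additional operand
            complexity += len(node.values) - 1
    return complexity


# Hypothetical function used only to exercise the counter.
EXAMPLE = '''
def classify(n):
    if n < 0:
        return "negative"
    for d in range(2, n):
        if n % d == 0 and d > 1:
            return "composite"
    return "other"
'''

print(cyclomatic_complexity(EXAMPLE))
```

The example contains two `if` statements, one `for` loop and one `and`, so the sketch reports a complexity of 5, well below the threshold of 20 used in the query.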

Let's execute the following CQL request to detect methods that should be refactored.

SELECT METHODS WHERE CyclomaticComplexity > 20 ORDER BY CyclomaticComplexity DESC

So only 1% of methods can be considered complex.

Which methods are complex and not commented enough?

SELECT METHODS WHERE CyclomaticComplexity > 20 AND PercentageComment

Methods with many variables

Methods where NbVariables is higher than 8 are hard to understand and maintain. Methods where NbVariables is higher than 15 are extremely complex and should be split into smaller methods (unless they are automatically generated by a tool).

SELECT METHODS WHERE NbVariables > 15 ORDER BY NbVariables DESC

Only 8 methods have too many variables.

Types with many methods and fields

Let's search for types with many methods. For that we can execute the following CQL request:

SELECT TYPES WHERE NbMethods > 30 AND !IsGlobal ORDER BY NbMethods DESC

Only 3% of types have many methods.

We can do the same search for fields:

SELECT TYPES WHERE NbFields > 20 AND !IsGlobal ORDER BY NbFields DESC

Less than 1% of types have many fields.

We can say that POCO is well implemented: few methods are considered complex, the types are simple with few methods and fields, and the code is well commented.

Design

Abstractness vs instability

The "Abstractness vs Instability" graph can be useful to detect projects that will be difficult to maintain or evolve.
This post describes what the graph is useful for and how to exploit it to improve the design.


For POCO, here is the "Abstractness vs Instability" graph:

Only Foundation is inside the zone of pain, which is normal because it is heavily used by the other projects.
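For readers unfamiliar with the two axes, here is a minimal sketch using the usual definitions from Robert C. Martin's package metrics: instability I = Ce / (Ca + Ce), abstractness A = abstract types / total types, and distance from the main sequence D = |A + I - 1|. The `ProjectStats` class and the numbers for `core` are hypothetical and are not taken from the POCO analysis.

```python
from dataclasses import dataclass


@dataclass
class ProjectStats:
    name: str
    afferent: int        # Ca: outside types that depend on this project
    efferent: int        # Ce: outside types this project depends on
    abstract_types: int  # abstract classes and interfaces
    total_types: int

    @property
    def instability(self) -> float:
        total = self.afferent + self.efferent
        return self.efferent / total if total else 0.0

    @property
    def abstractness(self) -> float:
        if not self.total_types:
            return 0.0
        return self.abstract_types / self.total_types

    @property
    def distance(self) -> float:
        # 0 means on the main sequence (A + I = 1), 1 means as far as possible.
        return abs(self.abstractness + self.instability - 1.0)


# Hypothetical numbers for a heavily used, mostly concrete base library.
core = ProjectStats("core", afferent=90, efferent=10,
                    abstract_types=5, total_types=100)

print(core.instability, core.abstractness, round(core.distance, 2))
```

A project like this hypothetical `core`, with low instability and low abstractness, lands near the origin of the graph, which is the zone of pain: many others depend on it, yet it is concrete and therefore costly to change.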

Inheritance

Multiple inheritance increases complexity, so it has to be used carefully.

Let's search for classes with more than one base class.

SELECT TYPES WHERE NbBaseClass >1

The blue rectangles represent the result.



Only a few classes derive from more than one class.

Type cohesion

The single responsibility principle states that a class should not have more than one reason to change. Such a class is said to be cohesive. A high LCOM value generally pinpoints a poorly cohesive class. There are several LCOM metrics. The LCOM takes its values in the range [0-1]. The LCOMHS (HS stands for Henderson-Sellers) takes its values in the range [0-2]. Note that the LCOMHS metric is often considered more effective at detecting non-cohesive types.
An LCOMHS value higher than 1 should be considered alarming.

SELECT TYPES WHERE LCOMHS > 0.95 AND NbFields > 10 AND NbMethods >10 AND !IsGlobal ORDER BY LCOMHS DESC

Only 1% of types are considered non-cohesive.
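The two cohesion metrics can be sketched as follows, assuming the formulas commonly documented for them: with M methods, F instance fields, and MF the number of methods that access a given field, LCOM = 1 - sum(MF) / (M * F) and LCOMHS = (M - sum(MF) / F) / (M - 1). The function name and the two example classes are hypothetical, and the tool's exact computation may differ in details such as which methods and fields are counted.

```python
from typing import Dict, Set, Tuple


def lcom_metrics(field_access: Dict[str, Set[str]],
                 method_count: int) -> Tuple[float, float]:
    """Return (LCOM, LCOMHS) for one class.

    field_access maps each field name to the set of methods that use it.
    """
    field_count = len(field_access)
    if method_count < 2 or field_count == 0:
        raise ValueError("need at least two methods and one field")
    sum_mf = sum(len(methods) for methods in field_access.values())
    lcom = 1.0 - sum_mf / (method_count * field_count)
    lcom_hs = (method_count - sum_mf / field_count) / (method_count - 1)
    return lcom, lcom_hs


# Hypothetical cohesive class: every method touches every field.
cohesive = lcom_metrics(
    {"x": {"get", "set", "reset"}, "y": {"get", "set", "reset"}},
    method_count=3,
)

# Hypothetical scattered class: each field is used by a single method.
scattered = lcom_metrics(
    {"a": {"f"}, "b": {"g"}, "c": {"h"}},
    method_count=3,
)

print(cohesive, scattered)
```

The cohesive class scores 0 on both metrics, while the scattered one reaches an LCOMHS of 1.0, which is above the 0.95 threshold used in the query.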

Efferent coupling

The efferent coupling of a particular type is the number of types it directly depends on.
Types where TypeCe > 50 depend on too many other types. They are complex and have more than one responsibility, which makes them good candidates for refactoring.

Let's execute the following CQL request

SELECT TYPES WHERE TypeCe > 50 AND !IsGlobal ORDER BY TypeCe DESC

The result is empty, so no class has too many responsibilities.

Types most used

It is very interesting to know which types are used the most. For that we can use the TypeRank metric.

TypeRank values are computed by applying the Google PageRank algorithm on the graph of types’ dependencies. A homothety of center 0.15 is applied to make it so that the average of TypeRank is 1.

Types with high TypeRank should be more carefully tested because bugs in such types will likely be more catastrophic.
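The idea behind TypeRank can be sketched with a plain PageRank power iteration over a dependency graph, rescaled so that the average rank is 1. This is an assumption-laden illustration, not CppDepend's implementation: the damping factor of 0.85 (which leaves a 0.15 share distributed uniformly), the handling of types without dependencies, the iteration count, and the type names are all choices made for this sketch.

```python
from typing import Dict, Set


def type_rank(dependencies: Dict[str, Set[str]],
              damping: float = 0.85,
              iterations: int = 50) -> Dict[str, float]:
    """PageRank over 'A depends on B' edges, scaled to an average of 1."""
    types = sorted(set(dependencies)
                   | {d for deps in dependencies.values() for d in deps})
    n = len(types)
    rank = {t: 1.0 / n for t in types}
    for _ in range(iterations):
        new_rank = {t: (1.0 - damping) / n for t in types}
        for t in types:
            targets = dependencies.get(t, set())
            if targets:
                # A type passes its rank on to the types it depends on.
                share = damping * rank[t] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A type with no dependencies spreads its rank evenly.
                share = damping * rank[t] / n
                for other in types:
                    new_rank[other] += share
        rank = new_rank
    mean = sum(rank.values()) / n
    return {t: r / mean for t, r in rank.items()}


# Hypothetical dependency graph: key depends on each type in its set.
graph = {
    "HttpServer": {"Socket", "Logger"},
    "FtpClient": {"Socket", "Logger"},
    "Socket": {"Logger"},
    "Logger": set(),
}
ranks = type_rank(graph)
print(max(ranks, key=ranks.get))
```

In this hypothetical graph `Logger` ends up with the highest rank because every other type depends on it directly or indirectly, so a bug there would affect the most code.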

Let's search for types that are both heavily used and complex.

SELECT TYPES WHERE TypeRank >10 AND CyclomaticComplexity >20

The result is empty, so no class is both heavily used and complex.

Layering and level metric

This post explains the level metric and how to exploit it to improve the design.

Let's search for dependency cycles. For that we can execute the following CQL request:

SELECT METHODS WHERE !HasLevel AND !IsGlobal

Only a few methods are involved in a dependency cycle. Let's take the Zip project as an example and look at its dependency graph.

Only one dependency cycle exists in this project.

POCO is also well designed: it has high cohesion and low coupling.
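The link between the level metric and cycles can be sketched as follows, assuming a simplified definition: an item with no dependencies has level 0, any other item has level 1 + the highest level among its dependencies, and an item that is in a cycle, or depends on one, has no level. The function and the module names are hypothetical, and the tool's real definition has additional rules that this sketch ignores.

```python
from typing import Dict, Optional, Set


def compute_levels(dependencies: Dict[str, Set[str]]) -> Dict[str, Optional[int]]:
    """Return each item's level, or None if it is in or depends on a cycle."""
    levels: Dict[str, Optional[int]] = {}
    in_progress: Set[str] = set()

    def visit(item: str) -> Optional[int]:
        if item in in_progress:
            return None  # back edge: we walked into a cycle
        if item in levels:
            return levels[item]
        in_progress.add(item)
        level: Optional[int] = 0
        for dep in dependencies.get(item, set()):
            dep_level = visit(dep)
            if dep_level is None:
                level = None  # no level once a cycle is reachable
                break
            level = max(level, dep_level + 1)
        in_progress.discard(item)
        levels[item] = level
        return level

    for item in list(dependencies):
        visit(item)
    return levels


# Hypothetical modules: key depends on each module in its set.
modules = {
    "util": set(),
    "io": {"util"},
    "net": {"io", "util"},
    "parser": {"lexer"},
    "lexer": {"parser"},   # parser <-> lexer form a cycle
    "app": {"net", "parser"},
}
levels = compute_levels(modules)
print(levels)
```

Items whose level comes back as `None` correspond to what the `!HasLevel` condition selects: `parser` and `lexer` form the cycle, and `app` loses its level because it depends on them.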
