Monday, September 25, 2006

Should you use a Directory or an RDBMS?

Most development projects use relational databases. But if you are working on a project that uses directory services (such as Active Directory or Sun ONE Directory Server), chances are high that you will end up comparing directories with relational databases. If you have been using an RDBMS for a long time, you will be tempted to apply the same logic to directories as well, and that is not a good idea.

Alternatively, consider this scenario. You are in the middle of designing an application when a question suddenly comes up: “Do I need to use a directory or a relational database here?” This is not really a difficult question to answer, but it does require a thorough understanding of both directories and relational databases. If you go ahead with an inadequate understanding of either, it can bring your application to its knees.

It is very important to understand the similarities and differences between the two. This post is an attempt to resolve a few doubts about directories and RDBMSs. I hope that the next time you face a situation like the ones above, you will be better equipped to make the call.

First of all, let us try to understand what we mean by a directory.

Ø So, what do we mean by a directory?
A directory is a specialized database designed for searching and browsing information. Typically, it stores typed and ordered information about objects. Directories are tuned for read performance, so they typically serve far more read operations than write operations. That does not mean you cannot perform writes; they are simply not what a directory is optimized for.

So, how is a directory related to LDAP? LDAP (Lightweight Directory Access Protocol) is an open-standard protocol that was originally designed for accessing X.500 directory services. The protocol runs over Internet transport protocols such as TCP. The LDAP standards cover the protocol itself, schema definitions, the LDIF file exchange format, and definitions of some object classes. If a directory is LDAP-compliant, it can interpret and respond to LDAP requests from LDAP client applications.

In a network, a directory tells you where something is located. On TCP/IP networks (including the Internet), the Domain Name System (DNS) is the directory system that maps a domain name to a specific network address (a unique location on the network). However, you may not know the domain name. LDAP lets you search for an individual without knowing where they are located (although additional information will help narrow the search).
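
To give a feel for what the LDIF exchange format looks like, here is a minimal Python sketch that renders one entry as LDIF text. The function name `to_ldif`, the entry, and the `example.com` addresses are all made up for illustration, and the sketch assumes plain ASCII values only (real LDIF also has rules for base64 encoding and line folding, which are not handled here).

```python
def to_ldif(dn, attributes):
    """Render one entry as LDIF text. Hypothetical helper, ASCII values only."""
    lines = [f"dn: {dn}"]
    for name, values in attributes.items():
        for value in values:
            # A multi-valued attribute is simply repeated, one line per value.
            lines.append(f"{name}: {value}")
    return "\n".join(lines) + "\n"


# Hypothetical entry, invented for this illustration.
entry_dn = "uid=jdoe,ou=People,o=Example,c=US"
entry_attrs = {
    "objectClass": ["top", "person", "organizationalPerson", "inetOrgPerson"],
    "cn": ["John Doe"],
    "sn": ["Doe"],
    "mail": ["jdoe@example.com", "john.doe@example.com"],
}

ldif_text = to_ldif(entry_dn, entry_attrs)
print(ldif_text)
```

Note how the entry is identified by its distinguished name (the `dn:` line) rather than by a row in a table.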

Most of us have used Microsoft Active Directory or Sun ONE Directory Server. You can now relate the definition of a directory to these products.
Okay, we know the definition of a directory. Now, what are the typical characteristics of directories?
1. Static Data:
The data stored in a directory is relatively static; it is not subject to frequent modification.
2. Hierarchical:
It can store objects in a hierarchy that expresses organization and relationships. An LDAP directory is organized as a simple tree with levels such as: root -> countries -> organizations within those countries -> organizational units -> individuals.
3. Standard Schema:
It uses a standard schema, which is available to all applications that use the directory.
4. Object-oriented:
It represents entities as objects. Each object is derived from one or more object classes (objectClass) and is a collection of attributes.
5. Multi-valued attribute:
The attributes can have multiple values.
6. Distributed:
It is distributed in nature. It can be distributed among many servers.
7. LDAP Protocol:
The lightweight LDAP protocol is used to access the directory.
8. No Transactions:
A directory typically does not support transactions that span multiple operations. If you need them, you can build custom transaction management into your client application.
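
The characteristics above (hierarchy, multi-valued attributes, read-heavy searching) can be sketched with a toy in-memory model. This is only an illustration, not how any real directory server works: the `Directory` class, its methods, and the entries are hypothetical, and DNs are compared as plain strings, whereas a real server normalizes them.

```python
class Directory:
    """Toy in-memory directory: maps a DN to its multi-valued attributes."""

    def __init__(self):
        self.entries = {}  # DN string -> {attribute name: list of values}

    def add(self, dn, **attributes):
        # Every attribute holds a list, so multiple values are the norm.
        self.entries[dn] = {name: list(vals) for name, vals in attributes.items()}

    def search(self, base_dn, attribute, value):
        """Return sorted DNs at or below base_dn whose attribute contains value."""
        return sorted(
            dn
            for dn, attrs in self.entries.items()
            # Naive subtree test: the entry is the base or its DN ends with it.
            if (dn == base_dn or dn.endswith("," + base_dn))
            and value in attrs.get(attribute, [])
        )


# Hypothetical tree: country -> organization -> organizational units -> people.
d = Directory()
d.add("o=Example,c=US", objectClass=["organization"], o=["Example"])
d.add("ou=People,o=Example,c=US", objectClass=["organizationalUnit"], ou=["People"])
d.add(
    "uid=jdoe,ou=People,o=Example,c=US",
    objectClass=["inetOrgPerson"],
    mail=["jdoe@example.com", "john.doe@example.com"],
)
d.add(
    "uid=asmith,ou=Sales,o=Example,c=US",
    objectClass=["inetOrgPerson"],
    mail=["asmith@example.com"],
)

# Search only the People subtree for a secondary mail address.
matches = d.search("ou=People,o=Example,c=US", "mail", "john.doe@example.com")
# Search the whole organization for every person entry.
people = d.search("o=Example,c=US", "objectClass", "inetOrgPerson")
print(matches)
print(people)
```

The search takes a base DN, so the position of an entry in the tree is part of the query itself. That is quite different from filtering rows in a flat table.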


Ø What are the characteristics of an RDBMS?
Most of us are aware of RDBMS concepts, including Codd’s 12 rules. So, instead of going into those details, let’s focus on the characteristics of an RDBMS as compared to a directory.
1. Dynamic/ frequently changing Data:
The stored data is updated frequently, so there are many more write/update operations. An RDBMS can also be used to store vast amounts of historical data, which can later be used for data mining or for building data cubes. This is really useful for business intelligence.
2. Relational:
Data is stored in tables, in the form of rows and columns.
3. Custom Database Schema:
The database schema is specific to the application. It can be anything from a simple schema to a star schema, a snowflake schema, etc.
4. Complex Data Models:
It typically uses complex data models with multiple tables, key constraints, join operations, etc.
5. Transactions Supported:
It supports transactions that satisfy the ACID properties.
6. Data Integrity:
It offers several mechanisms for data integrity, ranging from transaction rollback to referential integrity constraints.
7. SQL:
One can use SQL to run select/insert/update/delete queries against the data store. It also lets us use stored procedures, views, triggers, etc.
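
To make the transaction and data integrity points concrete, here is a small sketch using Python's built-in `sqlite3` module. The `account` table and the `transfer` function are invented for this example; the point is only that a failed statement rolls back the whole transaction, which is exactly what a directory does not give you out of the box.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " id INTEGER PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"  # integrity rule
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move money between two accounts atomically (hypothetical example)."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint failed; the credit above was rolled back too.
        return False


ok = transfer(conn, 1, 2, 30)       # succeeds
failed = transfer(conn, 1, 2, 500)  # would overdraw account 1, so it is undone
balances = dict(conn.execute("SELECT id, balance FROM account ORDER BY id"))
print(ok, failed, balances)  # True False {1: 70, 2: 80}
```
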
Now, having seen the characteristics of both directories and RDBMSs, one might ask whether it is possible to marry the two. The answer is yes. You can have an LDAP directory running as an application on top of an RDBMS; for example, Oracle and IBM provide this. Is this a good idea or not? Answering that would change the focus of this post, which I do not want to do, so I will try to address it in another post.

Ø Fine, so in which scenarios should I use directories?
Based on the above, you are now in a better position to judge when to use a directory and when to use an RDBMS. Still, what are the common scenarios in which directories are recommended?
Security:
A directory allows security down to the attribute level. Many directory-enabled applications are available to extend the directory's security mechanisms. You can also use the security features to control access to particular resources. Once these security rules are defined, they do not change frequently. That is why identity management applications use LDAP directories extensively.
Most RDBMSs offer column-level security, but the advanced security solutions available with directories are far more flexible and granular.
Data:
It is really important to know what kind of data you are going to handle, because its characteristics point to the solution. For example, if the data is going to be static most of the time, with few write operations, or it contains multi-valued attributes, and you do not need transactions, then a directory is likely to be the best fit.
Business Requirement:
This is the most important factor in choosing the solution. Based on the requirements and the cost-saving options, you can go for either of these solutions.
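
To illustrate the "Data" point above, here is a sketch of what a multi-valued attribute costs you in a relational model. It uses Python's built-in `sqlite3`; the `person` and `person_mail` tables and the sample data are hypothetical. In a directory the entry would simply carry two `mail` values, while in an RDBMS the usual normalized design needs a second table and a join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE person (uid TEXT PRIMARY KEY, cn TEXT NOT NULL);
    -- One row per mail address, because a column holds a single value.
    CREATE TABLE person_mail (
        uid  TEXT NOT NULL REFERENCES person(uid),
        mail TEXT NOT NULL,
        PRIMARY KEY (uid, mail)
    );
    INSERT INTO person VALUES ('jdoe', 'John Doe');
    INSERT INTO person_mail VALUES ('jdoe', 'jdoe@example.com');
    INSERT INTO person_mail VALUES ('jdoe', 'john.doe@example.com');
    """
)

# Reading the person together with all mail addresses requires a join.
rows = conn.execute(
    "SELECT p.cn, m.mail FROM person p"
    " JOIN person_mail m ON m.uid = p.uid"
    " WHERE p.uid = ? ORDER BY m.mail",
    ("jdoe",),
).fetchall()

# The same information as a single directory-style entry (plain dict here).
directory_entry = {
    "uid": ["jdoe"],
    "cn": ["John Doe"],
    "mail": ["jdoe@example.com", "john.doe@example.com"],
}
print(rows)
print(directory_entry["mail"])
```

Neither form is wrong. The relational version buys you constraints and joins across the rest of your schema, while the directory version keeps a mostly-read entry in one place.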

Cheers,
Amol.
