November 5, 2012

Transaction Auditing in Databases: A Survey and Assessment

Today, most organisations own databases with enormous amounts of highly sensitive data. These databases hold sensitive information of the companies such as financial records, employee salary figures and other sensitive data. Due to frequently reported cases of fraud and data theft it has become a top priority for the DBAs to secure their databases. Database auditing involves the logging, analyzing and reporting of any malicious transaction that has or is happening on the database. DBAs adopt different auditing strategies and techniques depending on their policy requirement and resources available. We will discuss these techniques and find out how they address the databases’ security needs.

1. Introduction

The last two decades saw an exponential growth in internet and information system usage. This growth brought with itself new security concerns since a considerable amount of the data now being made available on the web is sensitive. Millions of transactions take place on an organization’s online system every week. All organisations need to secure their systems to ensure the integrity, availability and confidentiality of data. While most of the efforts are made to prevent unauthorized access from outsiders, reports reveal that insider attacks cause equally severe damage to organizations’ data and assets. Such requirements are driving the growth of new and improved softwares, methods and techniques.
One of these techniques is database auditing. Auditing has long been recognized as being a critical element in secure systems. Using the audit data, it is possible to reconstruct events in a system. Therefore, auditing can be used to provide accountability for all user actions, including those affecting security. [1] The development of techniques to audit information systems arises from the need to trace access to sensitive or important information stored on these systems. Logging records events or statistics to provide information about system use and performance. Auditing analyzes these records to present information about the system in a clear and understandable manner. [2] But for a successful logging and auditing one has to know what information to log, and what information to audit. These constitute the auditing policies which may differ from organisation to organisation.

A good database access auditing solution should be able to provide answers to the following questions:

1.       Who accessed the data?
2.       When (date and time)?
3.       From where (IP address)?
4.       What query was issued to get the data?
5.       Was the request successful? (If yes) how many rows of data were retrieved?
6.       What data was changed?
One of the biggest problems with existing audit strategies is performance degradation. The audit trails that are generated must be detailed enough to capture before-and-after values of database changes. But capturing so much information, particularly in a busy system, can cause performance to suffer. Therefore, a good auditing facility must allow for the selective creation of audit records to minimize performance and storage problems.

The objective in this article is to discuss the various approaches and techniques used in auditing transactions in databases. Section 2 gives an overview of basic designing of auditing techniques. The next section discusses the merits and demerits of different auditing approaches like server side and client side auditing. Section 4 discusses the auditing techniques in detail with each of their pros and cons. Role of triggers in auditing databases has been discussed in section 5 along with brief descriptions of DDL triggers and Oracle FGA. Securing audit logs is the last and one of the most important part in database auditing and has been discussed in section 6 of this paper. 

2.  Basic Design of Auditing System

The implementation of most of the auditing strategies requires the design and implementation of effective logging and the traceability of events taking place in the information system. Auditors need to know the data accessed and the action done on these data to prepare him against a possible tampering of data. This can be achieved through analysing the audit trail records. Audit trails contain data in structured format, with audit information presented as a set of time-stamped events with other user identifying attributes. This information can be easily analyzed using any available database reporting tools. The first and foremost question is on which tables of information and which events the audit trail should be applied because it is important to keep the size of the audit log to the minimum possible. For this reason, normal events such as insertion and review of records are not recommended for an audit logging.

The audit logging needs to be more innovative in design so that it is easy for the auditor to analyze. It should give proactive signals and help prevent security incidents rather than throw a heap of information on auditors/security administrators.

While logging the modifications, there can be two modes:
·         Horizontal logging—Log the entire row as available before any modification.
·         Vertical logging—Log only the columns that have changed during the operations.
Both have their advantages and disadvantages.

2.1 Horizontal Logging

As stated before, this design logs the entire row of data as it was before undergoing any modification or deletion. This should be used only when the change per record (changed bytes/record size) is large else it would waste the disk space unnecessarily. However, it is ideal for deleted record logs because the entire record needs to be logged. Due to large data per record (the complete row) it becomes difficult to point out the change. Hence it is difficult to input auto alerts based on the nature or frequency of changes, since change cannot be analyzed easily. Detecting and reporting only the changes is a difficult process in horizontal logging.

Fig 1. Horizontal Logging – Entire row is logged.

2.1 Vertical Logging

This logs only the columns that have changed during the modification. This should be used when the change per record (changed bytes/record size) is less and hence proves to be economical as disk space used is less. It is quite easy to point out changes in this case since vertical logging maintains only the changed items and hence it is also simple to generate output. Auto alerts can be easily raised in this case. For example, an alert would be raised if data in the columns like name or salary is changed twice or more times in a day.

Fig 2. Vertical Logging – Only the required column is logged.

3. Auditing Approach

Auditing of the database can be performed within the client (client side), within the database (server side) or between the client and the database. Auditing within the client involves using the client’s database access tools to obtain a list of activities that the user has performed through the database. In order for the auditor to obtain a meaningful audit trail, all transactions that occurred would have to have occurred through these client tools. It is possible that the user uses any tool other than these for carrying out transactions on the database. Therefore this option is the worst option available to the auditor.

The second option involves server side database auditing. This approach brings with itself several advantages. Server-side auditing is the only method allowing auditing of every type of database access by any type of user, regardless of whether users are network based or local to the server. Server-side auditing can also audit all types of local access to the database server performed via remote system access protocols such as Telnet, SSH etc. It also allows auditing user activities performed using encrypted network protocols (SSH, SSL etc.) and encrypted database client-server communication protocols (Oracle TCPS, MySQL compression etc.). All these encrypted database communication protocols can be audited just as well as regular non-encrypted protocols. Server-side auditing records data to audit log tables or optionally to external files in a real-time and the recorded data is immediately available for use in reporting and alerting processes. Where as in the client-side approach reports and alerts can not be made available to the user in real time easily. Server-side auditing is the only solution for monitoring every type of back-door access to the server and the data. No changes or functionality sacrifices are required on the client-side in any existing applications and systems when server-side auditing is deployed.

But like everything there are drawbacks to this approach, primarily due to the limited audit functionality of DBMS, the inconsistency across DBMS types and the performance penalty that occurs when the audit mechanism is enabled. Every additional processing on the database server such as auditing creates additional input output operations and requires additional CPU resources. But if the auditing is setup properly and focused on what is really needed the performance degradation would be really negligible and that too will be offset by the benefits offered by the server-side auditing. Auditing within the database is certainly better than auditing within the client.

The third and next approach is auditing between the client and the database. It involves auditing the communications between the client and the database. The data packets between the client and server are captured, interpreted and then audited. This process can be accomplished using third party commercial software.

4.  Implementation Techniques

The most basic aim of auditing is to keep a track of all the modifications taking place on the database. Here modification refers to any insertion, deletion and updating of records in the database. This may also include any change in the audit policy of the system. There are several approaches which may be implemented to detect modifications in data records. For choosing the technique to be implemented in a particular situation one has to consider each technique’s advantages, disadvantages and the existence of the necessary features in the operational database. We would refer to the table upon which auditing has to be performed as the operational table. In the following section we will discuss some of the most used implementation method for auditing transactions in databases.

4.1 Logging the Entire Modified and Original Row into a New Table

Method: Create a table audit_<tablename> where <tablename> is the name of the table in operational database which has to be audited for modifications. This table has the same structure as that of the operational table including exact integrity constraints, except for the primary keys in some cases. This table includes some additional fields like timestamp, user name and a column identifying the change operation which was performed. Whenever any change occurs on any field in a row, the entire original and new values of the fields in the row are logged into the audit_<tablename> table. This method generally employs the use of triggers.
Necessary features: The DBMS must support triggers. Enough disk space should be available as the entire rows are being logged.
Advantages: It is assured that the entire record is mapped into the log even when all the values of a row changes.
Disadvantages: Causes overload on server as the entire row is first read from the old table and then written again with the changes into the audit table. Server performance degrades further if triggers are used for this purpose. Requires large storage space on disk which becomes uneconomical if unwanted data are also logged. Requires frequent cleaning of the audit log as large amount of data gets piled up very quickly.
When to implement: When logging the entire fields in the row becomes really necessary. Eg. Logging deleted entries of a table. Also when there are very less number of fields in a table one can go for this method as it won’t affect the disk space much. Eg. A table containing only the following columns: CUST_ID and ACC_NO.

4.2 Logging Only the Modified Data into a New Table

Method: Create an audit table containing the following fields: DBNAME (name of the operational database), TABNAME (name of the operational table),
COLNAME (name of the column where the target data to be modified resides), PKIDREC (Primary key value of the record where the target data to be modified resides, provided the primary key of the table is never modified). These fields are required to reach at the location of a particular data on the database. Other fields in this table would include ‘old_value’,     ’new_value’, ‘user_name’, timestamp and a column identifying the change operation which was performed.
Whenever any data is changed in the operational table the old and modified values are logged into the audit table along with the user name, timestamp etc. The other fields in the audit table give the exact location of the data which was modified. Generally this method too employs the use of triggers.
Necessary features: The DBMS must support triggers.
Advantages: It logs only the necessary (original and modified) data and hence utilizes lesser disk space than that in the previous method and hence it becomes easier to identify the changes at one glance. This makes audit reporting easier. Doesn’t causes much overload on server except when the triggers (if implemented) are called.
Disadvantages: If auto increment index is not present in a table there should be one such field in the table which can never be modified. This field indicates the position of the row in the table.
When to implement: When there are large number of fields in a table and changes are occurring on only a few columns and it is not feasible to log the entire row because of disk space limitation.

4.3 Adding an Extra Column on the Operational Table for Indicating and Storing the Modification.

Method: Create some extra columns in the table identifying the last change operation performed and when it took place and by whom.  
Necessary features: The DBMS must support triggers.
Advantages: Decrease in the need for storage space. Very less server overload as the process requires only shifting the data from one column of the table to another column in the same table.
Disadvantages: The entire record of changes taking place in the operational table is not stored, only the last change is logged.
When to implement: When the logging of all changes performed in operational database is not necessary, only the last change is required or only one or two columns of a table is to be audited and very few changes take place in them in a given period of time.

4.4 Auditing from Database Transaction Log.

Method: Transaction logs are created by database engines to assist with backup recovery. Typically when a database restore is requested, the database administrator first restores the last full backup of the database and then incrementally applies the transaction log backups until the desired time of restoration is accomplished. Since these transaction logs track the changes to data, the transaction logs can additionally be used for audit purposes provided the user has access to a transaction log viewer.
Necessary features: The DBMS should have implemented the transactions log control. The translation from the DBMS transactions log of SQL commands for insert, update and delete should be possible.
Advantages: There is no overload on the operational database.  Full record of all the transaction is available.
Disadvantages: Requires large disk space. Transaction log viewers are merely report data that were created solely for backup recovery and is therefore limited in its ability to provide selective auditing of data changes within a specific set of criteria.  When the size of transaction log crosses its limit cycling of transaction log takes place and these results into transaction lost from the log. Thus there is need for controlling the transaction log area size by the DBA. There is always a possible risk of lost of changes because of human, physical failure or by an unexpected operation by the DBA.
When to implement: When the mapping of all or part of the changes occurred in the operational database is necessary

4.5 Auditing Using Database Snapshots

Database Snapshots allows you to create a read-only copy of your database that can be used for other purposes, such as reporting, auditing or recovering data.
Method: When you create a snapshot, an empty structure of your database file is created. When data is read from the snapshot, it is actually being read from the operational database. As data changes in the operational database DBMS writes the original data (data before modification) into a spare file. The snapshot will then read that spare file for the data that has been changed and continue to read the primary database for data that has not changed.  Using the snapshot is just like using any other database. You log in to the database and execute your queries.
Necessary features: Database Snapshot feature must be available in the DBMS.
Advantages: The initial creation of the snapshot or subsequent snapshots is extremely fast. As the data changes it gets written to a spare file, so the overall disk space don't double each time a new snapshot is created. Several snapshots can be created for the same database, so snapshots can be taken frequently for auditing purposes.
Disadvantages: Snapshot is a read-only copy, so you can't issue UPDATE, DELETE or INSERT statements to it. The snapshot is tied to the operational database, so the operational database cannot be taken offline. If it goes offline, you won't be able to access any of the snapshots related to that. If data changes rapidly, this method may not offer benefits for auditing purpose.
When to implement: When the DBMS does not support triggers or when the implementation of triggers is not allowed or where there is no transaction log in DBMS or where there is no possibility for translating the transaction log.

5. Role of Database Triggers in Auditing

As discussed in the previous section triggers play a very important role in implementing database auditing.
In this section we will discuss major advantages and disadvantages of using triggers in the context of auditing. A trigger is a named database object that is associated with a table and is activated when a particular event occurs for the table. A trigger-based auditing approach captures the change at the source, hence the recording of old and new values is guaranteed. Other than this there are several other advantages of using triggers. Triggers can improve performance slightly because of the reduction in the number of messages passed from the client to the database server. For example, if the trigger fires five SQL statements, the client saves at least 10 messages passed between the client and database server (one to send the SQL statement and one for the reply after the database server executes the SQL statement). Triggers improve performance the most when they execute more SQL statements and the network speed is comparatively slow.

While logging the modifications done on the operational database to an audit database it is very important to maintain a sequential index of all the records. Any row of record can then be referred by this value. Generally this index is generated automatically by using the auto increment feature and hence it is also the primary key for the audit tables. Using a database trigger to populate a primary key with a sequence value - or any unique value for that matter - is the right thing to do. Depending on the APIs to generate primary keys is a dangerous approach. For example, if a developer decides to manually insert rows into the audit table (by bypassing the APIs) he queries the last number from the table’s primary key, and then inserts rows with hardcoded numbers in the key. He adds 1 to the number each time he inserts a new row. He does not use the auto incremented next value. The next time the API is called to create a new row, the sequence generates its next number, but this is the same number that the developer has already used. Unique constraint is violated. An error will be reported and the transaction will abort. On the other hand if the primary key is populated through a trigger this error would not happen as every time transaction takes place whether through an API or manually the primary key gets incremented.

Triggers run every time when the database fields are updated and it is overhead on system. This fear of overhead leads DBAs to minimize the number of modifications recorded or the period over which they are recorded.

Another drawback of triggers is that they cannot capture data views or changes to schema and permissions. Data manipulating statements such as INSERT, UPDATE or DELETE can be captured by using normal triggers. Because simple SELECT statements are non data manipulating, they neither fire a trigger nor go into archived logs that can be mined later, so triggers fall short where SELECT statements are concerned. Some DBMSs have come out with new techniques to overcome this problem. Oracle9i Database introduced a new feature called fine-grained auditing (FGA). This feature provides the functionality to audit individual SELECT statements along with exact statements issued by users. In addition to simply tracking statements, FGA provides a way to simulate a trigger for SELECT statements by executing a code whenever a user selects a particular set of data. Apart from the conventional DML triggers Microsoft SQL Server provides the feature of DDL Triggers. DDL Triggers are a special kind of trigger that fire in response to a variety of Data Definition Language (DDL) events. These events primarily correspond to Transact-SQL statements that start with the keywords CREATE, ALTER and DROP. They can be used to perform administrative tasks in the database such as auditing and regulating database operations.

Hanson and Noronha [3] propose a simple model for a time-driven triggering and alerting system. Such a system can be used with relational and object-relational databases systems. They are relatively easy to implement since they can be built using a middleware program that simply runs SQL statements against a DBMS. Also, they can check certain types of conditions, such as "a value did not change" or "a value did not change by more than 10% in six months."
In cases where it may be perfectly reasonable to be notified only once an hour, once a day, or once a week when an interesting change occurs this approach can be found useful. This is especially true when we are interested to audit the change on an aggregation, such as a sum, average or count.

6. Security of Audit Logs

In order to implement a best possible auditing solution it is important to focus on the security of audit logs. The audit logs must be protected by employing techniques based on cryptographically strong one-way hash functions that prevent an intruder, including an auditor or an employee or even an unknown bug within the DBMS itself, from silently corrupting the audit log. The DBMS should store additional information in the database to enable a separate audit log validator to examine the database along with this extra information and state conclusively whether the audit log has been compromised. [4] Furthermore the following measures can be taken to secure the audit logs against tampering: The access rights to audit logs should not be with the operations department or the supervisor/administrator. They should be accessible only to the internal/external auditors. The audit log table should have an audit log on itself to record any tampering.

7. Conclusion

The need for auditing data changes has increased with the increase in the volume of transactions and the opening up of organizations’ data to the external world. The increased volume of transactions has further posed a challenge for the selective logging of possible security events. In this paper various design and implementation of auditing techniques have been discussed. The advantages and disadvantages in each technique have been mentioned clearly along with the scenarios under which these techniques could be used. The auditor must keep in mind his requirements while selecting any of these methods for auditing. Triggers play an important role in adopting many of these techniques. Triggers cause only a slight performance impact that can be offset by the decreased number of messages passed between the client and the server.


[1] Kogan B.,  Jajodia S., “An audit model for object-oriented databases”, Computer Security Applications Conference, 1991. Proceedings, Seventh Annual.  
[2] Bishop et al., "Goal-Oriented Auditing and Logging", submitted to IEEE Transactions on Computing Systems, 1996.
[3] Eric N. Hanson, et al., Timer-Driven Database Triggers and Alerters: Semantics and a Challenge, SIGMOD Record, Dec. 1999, pp. 11-16, vol. 28 No. 4.
[4] Richard T. Snodgrass, Stanley Yao and Christian Collberg, "Tamper Detection in Audit Logs," In Proceedings of the International Conference on Very Large Databases, Toronto, Canada, August–September 2004, pp. 504–515.

Need for multi core processors

Today’s world is ripe with computing technology. Computing technology is all around us and it’s often difficult to decipher what is actually being offered at any given moment with any given advancement. Oftentimes advancements can come so quick that we do not take the time to understand the importance it has on our fields.
One such advancement in computing technology is Parallelism. Parallelism is defined by the execution of many computing calculations carried out simultaneously. The foundation behind parallelism is that large calculations can be completed quicker if they are divided into smaller calculations and all solved in parallel, hence the root definition of the word. Parallelism has just recently achieved mass popularity in the technology industry long after the advent of the microprocessor.
Recent physical constraints like power delivery and heat dissipation, and the need for higher clock frequencies have made single-core processors outdated. Parallel computing capabilities enabled in common software applications coupled with advances in microprocessor technologies have made multi-core processors the preferred architecture for today’s PCs. Multi- core processors standout and are differentiated by their core-count, or the amount of physical cores contained on the processor die itself. Core-count differentiates processors and is often used in the nomenclature: e.g. "Intel 2.6 GHz LGA 1155 Dual Core G620 Processor " and “AMD Phenom II X4 quad-core processor”.
Parallelism and multi-core processors go hand-in-hand. When large calculations are divided into many smaller calculations it is best to have dedicated technology resources ready to work on each of those smaller calculations simultaneously. This is where the performance of physical multi- core processors really begin to shine, and where multi-core processor technology shifts from a want to have to a must have.