While this list of SQL interview questions mostly helpful for Business Analysts who need to know little bit of SQL knowledge to face the interview and daily job duties.. Not every Business Analyst will have to use SQL. It depends upon company you are working.
Describe the following:
b) Acid properties
A transaction is a unit of work, usually logical, in which all steps must be executed, if any is omitted, the entire transaction is nullified.
The term ACID is an abbreviation of Atomicity, Consistency, Isolation, Durability
These constitute the properties of a transaction; more information regarding these properties is available in SQL (Structured Query Language) Server books online or in any Relational Database Management Systems (RDBMS)’s fundamental textbook.
What is meant by de-normalization and state its usefulness.
As suggested by the name, this is a process that reverses normalization process.
It involves introducing redundancy into the database in a controlled manner.
Its purpose is to improve the query performance as it reduces the number of joins.
How many relationships in SQL
There are three types of relationships in a relational database system, namely the one-to-one, one-to- many and many-to-many, briefly describe how you would implement each one of them when designing a table.
i) One to one relationship:
This is usually implemented in a single table and thus of less technical importance as compared to the others though it can sometimes be implemented as two tables with primary keys and foreign keys
ii) One-to-many relationship:
Such a relationship is implemented by splitting the data into tables thereby ending up with a primary and foreign key relationship.
iii) Many – to – many relationship:
This type of database relationship is implemented in the following way:
A junction table is created having the primary keys from both the related tables thus forming a common primary key for both the tables.
State the difference and similarity between unique and primary keys in a database.
The difference is that by default primary key creates an index column which is clustered whereas the unique key creates a non-clustered index under default situation. The second major difference is that primary key can not allow a NULL entry whereas the unique key just allows the NULL entries. Similarity is that both the keys are used to enforce uniqueness in particular columns.
Describe user-defined data types and their importance.
User-defined datatypes are datatypes created by the user from the primitive datatypes such as the int. Examples of user defined data types involve classes and structures.
They help one to extend the base Structured Query Language (SQL) Server datatypes in that they can provide a descriptive name and also format the database as per the user’s desire. Consider a database comprising of a column called Flight_Num appearing in many tables all of which are of varchar(8) i.e. variable character datatype of length eight in such a situation, one would do a user defined datatype and call it Flight_num_type of varchar(8) and use it all along the table.
Explain the meaning of bit datatype and state the kind of information that one can store in a bit column:
Bit datatype are datatypes of type bool, i.e. true or false or the logical 1 and 0 respectively, thus a bit column can only store the datatypes of this nature. SQL Server version 7.0 and above support a third form of bit datatype called the NULL datatype.
Explain the following types of keys as used in database systems: Candidate key, composite key and the alternate key.
Candidate key is the type of key used in identifying each and every row uniquely in a given table. It usually becomes the primary key of the given table, in a situation whereby there exists more than one candidate key, only one will be considered to be the primary key and the rest will then be called the alternate keys, a combination of these keys, i.e. the candidate as well as the alternate keys is referred to as the composite key.
Describe the term ‘default’ and state whether or not there is a column in which defaults can not be bound.
A default is a value which will be automatically adopted by the column under circumstances whereby the user has not provided any value.
Columns such as IDENTITY column and the TIMESTAMP column can never have default bound to them
What is transaction and its ACID properties.
A transaction is a unit of work, usually logical, which have steps that all have to be performed, in an event that some steps of a transaction are omitted, the entire transaction is nullified. The term ACID stands for Atomicity, Consistency, Isolation, Durability, which are the terms constituting the properties of a transaction.
What is Lock Escalation.
Escalation is the process of converting high volume of low level locks e.g. row and page locks into high level locks such as table locks.
It is worth noting that every lock constitutes a memory structure thus many lock would mean high use of memory.
To treat the above, SQL Server escalates the many fine grain locks in order to yield few coarse grained locks thereby saving memory and yielding into efficient resource utilization.
What is the difference between DELETE and TRUNCATE table commands.
The difference is that DELETE TABLE is a logged operation, meaning that each and every row deleted ends up getting logged in transaction log making it relatively slow whereas in the case of TRUNCATE TABLE, there is no logging of each row and as a result it is faster, here, there is just de-allocation of data pages for the table. In the case of TRUNCATE TABLE command, ACTION ROLLBACK can take place thus an added advantage.
What is SQL constraint and also explain different types of constraints.
Constraints are limits used in the Relational Database Management System to implement integrity of the database without human intervention (automatically) or without even the need for the user triggers, rules or defaults. The following are the available types of constraints: CHECK, UNIQUE, NOT NULL, FOREIGN KEY, and PRIMARY KEY.
Describe the term index naming the available parts (of index). Also state the number of clustered indexes that can be created on a given table and then state the advantage and disadvantages of an approach which involves creation of a separate index on each column in a given table.
The term index as referred to in SQL Server reserves the meaning as to the way it is used in a book; they help in quickening the rate of information/data retrieval from the server.
There exist two types of indexes in a SQL database system, namely the clustered and the non-clustered indexes.
The difference is that when a clustered index is put on a table, all the rows in this table are stored in the order of the given clustered key index, it is worth noting that only one clustered index can exist in any table. Non-clustered indexes have their own storage separate from the storage for the table data.
Query performance is improved when one creates an index on each and every column of the table. Here, the query optimizer is in a position to choose from all the existing indexes so that it may come up with an efficient execution plan but operations involving data modification e.g. the DELETE, UPDATE or INSERT becomes slower as it is observable that all the indexes are to be updated each and every time data changes in the table .Also one uses more disk space as the number of indexes is usually directly proportional to the amount of disk space used thus another disadvantage.
Steps necessary to improve performance of query if at all the performance is found to be low.
Usually, so many reasons can lead to a low performance for queries and every reason has its own cure, the most common ways to solve such a problem is to ensure that all indexes, tables scans are available and that statistics are updated to the date, also one should try to ensure that there exist no blocking, no excess recompilation of stored procedures, try ensure that there is no procedures and triggers without SET NO COUNT ON, eliminate all the queries which have been poorly written especially those without necessary joins, reduce the amount of normalization as well as excess use of temporary tables and cursors.
The following are methods that can help one in troubleshooting a performance problem:
SET SHOWPLAN_TEXT ON, SET STATISTICS IO ON, SET SHOWPLAN_ALL ON, WINDOWS NT/2000 Performance Monitor, Graphical Execution Plan in query analyzer, SQL Server profile
Explain the steps necessary to ensure security in a SQL server.
There are so many things one could talk about in an attempt to answer this question, these include the preference to NT authentication, use of server and database as well as application roles to get assistance in controlling access to the data, using the New Technology File System (NTFS) permissions to physically protect database, using a hard to guess SA password, creating a physical perimeter around the server to physically restrain unwanted access, doing away with or simply disabling guest account, renaming the administrator account on the SQL server, using a high level of encryption such as the multi-protocol encryption, putting up an SSL, isolating the database server (SQL Server) from the Web Server, making use of firewalls, advising users on the importance of server security, detecting the potential threats and addressing them in time, having backups to lower downtime in event of successful attacks, ensuring a hierarchy to the users so that less trusted users have less only access to less sensitive information, etc.
Distinguish between deadlock and lovelock stating clearly how one would go about the deadlocks.
Deadlock is a situation whereby two or more processes are not able to carry on with their operations because they are waiting for data which is being held by other processes waiting for some other data or recourses held up elsewhere too. Each process will have to wait for an indefinite period of time unless an external intervention is brought upon by forces outside the deadlock, e.g. by the help of an efficient algorithm or so, usually in an SQL server, this may involve terminating one of the user�s processes to allow the rest of the processes carry out the relevant operations with the required resources then leave it for the other processes in turns.
A live lock on the other hand is the situation whereby there exist repetitive denials for a request regarding an exclusive lock. This happens due to the fact that there is a series of overlapping shared locks which keep interfering. SQL server has a mechanism to detect such a situation, usually when it happens more than four times and thereby curing the situation by refusing any further shared locks.
A live lock under normal circumstances takes place when a table or a page is monopolized by a read transaction thus forcing the write transactions to wait indefinitely.
Write a short notes regarding the CREATE DATABASE syntax.
By impulse, many people just create databases using the Enterprise Manager or by simply entering the CREATE DATABASE command, this method does not take into account a situation whereby one needs to create a database with more than one file groups, say, even two, whereby one is on the drive C and another on the on a separate drive, say, D with a log on drive E having an initial size of say 600 MB and growing at a factor of 15%. Due to this, database administrators needs to be very familiar with the CREATE DATABASE syntax.
What is Restarting an SQL server in a Single User Mode and Starting a server in Minimal Configuration Mode?
Like many programs, SQL too can be started from the command line, go to start, run, then type in SQLSERVER.EXE, here exists some parameters which the database administrator is supposed to be familiar with including, -m used to for starting the SQL Server in single mode as asked in this question and �f used to start the SQL server in minimal configuration mode.
Describe the term statistics, stating under what circumstances they go out of date and how to correct this situation, i.e. update the out of date statistics.
Statistics is a set of parameters obtained and analyzed from a given set of data; usually used in determination of selectivity of indexes, the selectivity is done in a manner that it is more to the indexed columns with unique values as opposed to the index columns without unique values. Indexes are also used by the query optimizers to determine whether or not to choose an index while executing the given query.
The following are the situations under which it is necessary that one updates statistics.
1) When upgrading database to the next version.
2) When there is a change in the distribution of key values, i.e. when there is addition, change or removal of a large amount of data in an indexed column or when there is truncation of the table using the TRUNCATE TABLE statement and then a repopulation.
3) When there is a notable change in the key values in the given index.
State the ways one can use to move data or databases between different servers and also databases just within the SQL Server.
So many ways can lead to the solution of this problem and thus it is upon the user�s discretion to choose the way forward depending on the problem pertinent to his situation.
Available alternatives include BACKUP/RESTORE, detaching and attaching databases, DTS, replication of the database, BCP, using the INSERT-SELECT statement, using the INSERT-INTO statement, using or creating INSERT scripts to generate data.
Define database replication, state different kinds of replication one can set up in the Structured Query Language (SQL) Server.
Copying or moving data over the same or different SQL Servers is what is referred to as replication.
The following are types of replication scenarios supported by the SQL server
a) Merge replication
b) Transactional replication
c) Snapshot replication
Describe the term cursor, and then write short notes about cursor regarding types of cursor, disadvantages of cursors and how to avoid cursors.
Cursors are attributes which allow that allow processing of result sets in a row by row format. The following types of cursors do exist:
– Static cursors
– Dynamic cursors
– Key-set driven cursors
– Forward only cursors
Disadvantages of cursors:
There is a result into a network roundtrip every time one fetches a row from the cursor yet a normal SELECT query makes only one roundtrip even if the result set is very large.
They are costly since they consume more memory for storage.
Again, case of memory results into consumption of processor cycles as this involves more use of the input output operations. Not all types of cursors can be used with the SELECT statement fully; there exists restrictions regarding this concept.
In many occasions, it is usually advisable that one substitutes the use of cursors with the use of set-based operations as can be illustrated below:
Assuming one gives a flat hike to his employees basing on the following criteria
Salary between 100 and 200 — 50hike
Salary between 200 and 300 — 60 hike
Salary between 300 and 400 — 70 hike
Many developers would tend to use cursors in such a situation whereby they would base on the above formula to determine the employee�s salary and update it using the same formula.
Multiple update statements too can be used to get the same result, or one can have the multiple updates combined into a single update to yield the same results as follows:
UPDATE tbl_emp SET salary =
CASE WHEN salary BETWEEN 100 AND 200 THEN salary + 50
WHEN salary BETWEEN 200 AND 300 THEN salary + 60
WHEN salary BETWEEN 300 AND 400THEN salary + 70
Cursors also become misleading to the developers in the following situations:
When it is necessary that one calls a stored procedure at the time when a certain condition is met by a particular row, one doesn�t need to use cursors as may would for this and instead would achieve the same results by just using the WHILE loop as long as he has ascertained the existence of a unique key that identifies each row.
In an OLTP system, there exists a trigger which is defined for the operations regarding the INSERT, which is usually written to provide an instantiation to the COM object and passing the newly inserted rows to object. This is done with an intention of bringing about customized processing; do you think there is a better way of implementing this?
It is worth noting that instantiating a COM object would generally slow the data insertion process since this process is time consuming in nature and it is usually done from within a trigger. This happens in a way similar to a situation whereby one is sending an email from within the trigger. One better way to implement this scenario is for one to log all the data found to be necessary into a single separate table, and then find a job to be checking this table periodically so as to take an appropriate action.
Explain the meaning of the phrase �Extended Stored Procedure� and then comment as to whether it is possible for one to instantiate a COM object with the use T-SQL (T Structured Query Language).
An extended stored procedure is a method or a function within a Data Link Library (DLL) usually written in a high level programming language such as Java, C, C++ with the use of the so called the Open Data Services (ODS) Application/Abstract Programming Interface (API) which can be called from the T SQL in the same way as we call the normal stored procedure with the help of the EXEC statement.
Explain the meaning of the word join giving the different types of joins.
Join is a word used to explain the relationships within different kinds of tables.
It can also assist a user to select a data in a particular table based on the criteria or data specified in another different table.
The following is a list of possible types of joins:
– Inner joins
– Cross joins
– Outer joins � further classified into left outer, right outer and the full outer joins