Friday, February 19, 2010
Convert multiple rows into one row - SQL Server
Method 1:
DECLARE @str varchar(4000)
SET @str = (SELECT CONTACT_EMAIL + ';' FROM table FOR XML PATH(''))
SET @str = SUBSTRING(@str,1,LEN(@str)-1)
SELECT @str
Method 2:
DECLARE @str varchar(4000)
SELECT @str = COALESCE(@str + ';', '') + CONTACT_EMAIL FROM table
SELECT @str
Method 3:
DECLARE @str varchar(4000)
SELECT @str = STUFF( (SELECT ';' + CONTACT_EMAIL FROM table FOR XML PATH('')),1,1,'')
SELECT @str
Multiple rows returned:
CONTACT_EMAIL
abc1@domain.com
abc2@domain.com
abc3@domain.com
3 row(s) affected.
After executing any one of the methods, I got the result:
CONTACT_EMAIL
abc1@domain.com;abc2@domain.com;abc3@domain.com
1 row(s) affected.
NOTE: I tried these methods in SQL SERVER 2005. Some of them might not work in earlier versions.
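The same FOR XML PATH trick also works per group with a correlated subquery. A minimal sketch, assuming a hypothetical DEPT_ID grouping column on the same table:
SELECT DEPT_ID,
STUFF((SELECT ';' + CONTACT_EMAIL
FROM table t2
WHERE t2.DEPT_ID = t1.DEPT_ID
FOR XML PATH('')), 1, 1, '') AS emails
FROM table t1
GROUP BY DEPT_ID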
Thursday, February 18, 2010
MVC Overview, Agile, Methodology, ADO.NET Entity Framework
ASP.NET MVC Overview (C#)
The Model-View-Controller (MVC) architectural pattern separates an application into three main components: the model, the view, and the controller. The ASP.NET MVC framework provides an alternative to the ASP.NET Web Forms pattern for creating MVC-based Web applications. The ASP.NET MVC framework is a lightweight, highly testable presentation framework that (as with Web Forms-based applications) is integrated with existing ASP.NET features, such as master pages and membership-based authentication. The MVC framework is defined in the System.Web.Mvc namespace and is a fundamental, supported part of the System.Web namespace.
MVC is a standard design pattern that many developers are familiar with. Some types of Web applications will benefit from the MVC framework. Others will continue to use the traditional ASP.NET application pattern that is based on Web Forms and postbacks. Other types of Web applications will combine the two approaches; neither approach excludes the other.
The MVC framework includes the following components:
Models. Model objects are the parts of the application that implement the logic for the application's data domain. Often, model objects retrieve and store model state in a database. For example, a Product object might retrieve information from a database, operate on it, and then write updated information back to a Products table in SQL Server.
In small applications, the model is often a conceptual separation instead of a physical one. For example, if the application only reads a data set and sends it to the view, the application does not have a physical model layer and associated classes. In that case, the data set takes on the role of a model object.
Views. Views are the components that display the application's user interface (UI). Typically, this UI is created from the model data. An example would be an edit view of a Products table that displays text boxes, drop-down lists, and check boxes based on the current state of a Products object.
Controllers. Controllers are the components that handle user interaction, work with the model, and ultimately select a view to render that displays UI. In an MVC application, the view only displays information; the controller handles and responds to user input and interaction. For example, the controller handles query-string values, and passes these values to the model, which in turn queries the database by using the values.
The MVC pattern helps you create applications that separate the different aspects of the application (input logic, business logic, and UI logic), while providing a loose coupling between these elements. The pattern specifies where each kind of logic should be located in the application. The UI logic belongs in the view. Input logic belongs in the controller. Business logic belongs in the model. This separation helps you manage complexity when you build an application, because it enables you to focus on one aspect of the implementation at a time. For example, you can focus on the view without depending on the business logic.
In addition to managing complexity, the MVC pattern makes it easier to test applications than it is to test a Web Forms-based ASP.NET Web application. For example, in a Web Forms-based ASP.NET Web application, a single class is used both to display output and to respond to user input. Writing automated tests for Web Forms-based ASP.NET applications can be complex, because to test an individual page, you must instantiate the page class, all its child controls, and additional dependent classes in the application. Because so many classes are instantiated to run the page, it can be hard to write tests that focus exclusively on individual parts of the application. Tests for Web Forms-based ASP.NET applications can therefore be more difficult to implement than tests in an MVC application. Moreover, tests in a Web Forms-based ASP.NET application require a Web server. The MVC framework decouples the components and makes heavy use of interfaces, which makes it possible to test individual components in isolation from the rest of the framework.
The loose coupling between the three main components of an MVC application also promotes parallel development. For instance, one developer can work on the view, a second developer can work on the controller logic, and a third developer can focus on the business logic in the model.
Deciding When to Create an MVC Application
You must consider carefully whether to implement a Web application by using either the ASP.NET MVC framework or the ASP.NET Web Forms model. The MVC framework does not replace the Web Forms model; you can use either framework for Web applications. (If you have existing Web Forms-based applications, these continue to work exactly as they always have.)
Advantages of an MVC-Based Web Application
· It makes it easier to manage complexity by dividing an application into the model, the view, and the controller.
· It does not use view state or server-based forms. This makes the MVC framework ideal for developers who want full control over the behavior of an application.
· It uses a Front Controller pattern that processes Web application requests through a single controller. This enables you to design an application that supports a rich routing infrastructure. For more information, see Front Controller on the MSDN Web site.
· It provides better support for test-driven development (TDD).
· It works well for Web applications that are supported by large teams of developers and Web designers who need a high degree of control over the application behavior.
Advantages of a Web Forms-Based Web Application
· It supports an event model that preserves state over HTTP, which benefits line-of-business Web application development. The Web Forms-based application provides dozens of events that are supported in hundreds of server controls.
· It uses a Page Controller pattern that adds functionality to individual pages. For more information, see Page Controller on the MSDN Web site.
· It uses view state or server-based forms, which can make managing state information easier.
· It works well for small teams of Web developers and designers who want to take advantage of the large number of components available for rapid application development.
· In general, it is less complex for application development, because the components (the Page class, controls, and so on) are tightly integrated and usually require less code than the MVC model.
Features of the ASP.NET MVC Framework
· Separation of application tasks (input logic, business logic, and UI logic), testability, and test-driven development (TDD) by default. All core contracts in the MVC framework are interface-based and can be tested by using mock objects, which are simulated objects that imitate the behavior of actual objects in the application. You can unit-test the application without having to run the controllers in an ASP.NET process, which makes unit testing fast and flexible. You can use any unit-testing framework that is compatible with the .NET Framework.
· An extensible and pluggable framework. The components of the ASP.NET MVC framework are designed so that they can be easily replaced or customized. You can plug in your own view engine, URL routing policy, action-method parameter serialization, and other components. The ASP.NET MVC framework also supports the use of Dependency Injection (DI) and Inversion of Control (IOC) container models. DI allows you to inject objects into a class, instead of relying on the class to create the object itself. IOC specifies that if an object requires another object, the first object should get the second object from an outside source such as a configuration file. This makes testing easier.
· A powerful URL-mapping component that lets you build applications that have comprehensible and searchable URLs. URLs do not have to include file-name extensions, and are designed to support URL naming patterns that work well for search engine optimization (SEO) and representational state transfer (REST) addressing.
· Support for using the markup in existing ASP.NET page (.aspx files), user control (.ascx files), and master page (.master files) markup files as view templates. You can use existing ASP.NET features with the ASP.NET MVC framework, such as nested master pages, in-line expressions (<%= %>), declarative server controls, templates, data-binding, localization, and so on.
· Support for existing ASP.NET features. ASP.NET MVC lets you use features such as forms authentication and Windows authentication, URL authorization, membership and roles, output and data caching, session and profile state management, health monitoring, the configuration system, and the provider architecture.
---------------------------------------------------------------------------------------------------------------------------------------------
Many of us have experienced projects that drag on much longer than expected and cost more than planned. Companies looking to improve their software development processes are now exploring how Agile can help their enterprise more reliably deliver software quickly, iteratively, and with a feature set that hits the mark. While Agile has different "flavors", Scrum is one process for implementing Agile. This newsletter is the first in a series that will discuss the Agile Scrum process and will end with variants of Scrum that can be used to aid in improving your software releases.
So what is Agile?
According to Wikipedia, Agile software development is a conceptual framework for software engineering that promotes development iterations throughout the life-cycle of the project. Simply put, Agile allows your team to identify the most critical features of the software that can be completed within a short time frame (normally 1 to 2 months), and it delivers a complete build with this set of limited features as the first iteration. Once that is done, you can move those features to production or continue on to the next iteration. By breaking the releases into shorter stints, you gain quicker releases and capture return on investment sooner by putting the working (but limited) features into production earlier. This is in stark contrast to the more traditional "Waterfall" approach, where you design all features upfront, code each one, test each one, then move into production. Agile projects are iteratively released to production in a matter of months, whereas Waterfall projects normally span a year or more before they are released to production.
So what is Scrum?
Scrum is a process for implementing Agile, where features are delivered in 30 day sprints. Scrum borrows its name from rugby, where a scrum is the act of stopping play and then restarting it with a burst of vigorous activity. The same idea applies here: you define the requirements for a 30 day sprint and work on them with vigor for 30 days without being sidetracked by other things or having work re-prioritized. A specific feature is not recognized as being completed until it is analyzed, designed, coded, tested, re-factored and documented. At the end of the 30 day sprint, most features defined for the sprint should be completed. If some did not get finished (because they were underestimated), the uncompleted features can be moved to a later sprint. A sprint is considered successful if all the completed features have high quality and can be put into production (or beta) when the sprint ends.
Do Team Member Responsibilities Change?
Managing Scrum development requires a major change in how teams work together. In traditional Waterfall development, teams normally have a project sponsor, a project manager, analysts, designers, programmers, testers, and documentation specialists. Each team member has specific duties which normally do not overlap and they have a specific reporting structure (most team members report to the project manager).
With Scrum, you have just 3 team roles, and teams are normally limited to 7 or fewer individuals (however, you can have multiple Scrum teams in sets of 7 or fewer):
- Product Owner - This is the person who identifies and prioritizes the features that will appear in a 30 day sprint. This is normally the CEO, CTO, or some other high-level stakeholder who ultimately is responsible for shaping the roadmap of the product.
- ScrumMaster - The ScrumMaster is akin to the Project Manager in Waterfall environments, but does not manage the team deliverables at a micro level. Instead, this person is responsible for keeping the 30 day sprint on course, ensuring no new features are added to the sprint, overseeing code inspection, and making sure everyone plays by the rules.
- The Team - With Waterfall, a team consists of analysts, designers, testers and documentation specialists. With Scrum, each team member is empowered and expected to self-manage and to participate in all duties needed to deliver a feature. This includes analysis, design, coding, testing and documentation.
So how does Scrum Work on a Day-by-Day Basis?
Scrum begins with an 8 hour Scrum Kickoff Meeting. The Scrum Kickoff Meeting is divided into two 4-hour segments: in the first, you determine what features are desired for the 30 day sprint; the last 4 hours are used to provide rough estimates for the items identified for the sprint. If the estimates exceed the available resources, the features are prioritized and less important features are dropped from the sprint. An important component of Scrum is the time-box approach, where meetings and events have a definite time period (e.g., no more than 8 hours for the kickoff meeting) and this time-box is strictly enforced. Once the features are locked in for the 30-day sprint, no changes are allowed (new features cannot be introduced until the next sprint). When estimating features for a sprint, the estimates must include time for analysis, design, coding, testing, re-factoring, and documentation. A feature is not considered complete until all those things are done.
Each day, a Daily Scrum Meeting is held to determine how the features are progressing. The meeting is no longer than 15 minutes, and each team member is asked 3 questions:
- What have you accomplished since the last Daily Scrum Meeting?
- What will you do before the next Daily Scrum Meeting?
- Is there anything that is impeding your progress (and remedies are discussed)?
From a programmer's perspective, Scrum development is a new paradigm which is very empowering but does require them to follow specific rules:
- Code is only checked out for the duration needed to complete a feature. No exceptions. Most code will be checked in daily, as most features are broken down into small feature sets.
- Time must be entered daily. For each feature, you will have estimated hours, actual hours and hours remaining to complete the feature. This information must be updated at the end of every day so that the ScrumMaster can determine if the release progress is trending as required.
- Programmers are not allowed to be pulled off onto tangent projects; they must stick to the features they have been assigned for the sprint.
- All team members must attend the Daily Scrum Meeting and must be on time.
- Code is compiled and deployed to a test server daily. Teams can use automated build tools to speed up this process. Automated tests should be run against the daily releases to discover any issues introduced by the release.
Once a Scrum 30 day sprint is completed, all features that were completed can then be moved to a beta or production environment. Following the sprint is a Retrospective (post mortem), where team members discuss and document things that went well and things that can be improved upon in the next sprint.
Methodology
1. organizing system: the methods or organizing principles underlying a particular art, science, or other area of study
2. study of organizing principles: in philosophy, the study of organizing principles and underlying rules
3. study of research methods: the study of methods of research
"the analysis of the principles of methods, rules, and postulates employed by a discipline"; "the systematic study of methods that are, can be, or have been applied within a discipline".
ADO.NET Entity Framework
The ADO.NET Entity Framework is designed to enable developers to create data access applications by programming against a conceptual application model instead of programming directly against a relational storage schema. The goal is to decrease the amount of code and maintenance required for data-oriented applications. Entity Framework applications provide the following benefits:
1. Applications can work in terms of a more application-centric conceptual model, including types with inheritance, complex members, and relationships.
2. Applications are freed from hard-coded dependencies on a particular data engine or storage schema.
3. Mappings between the conceptual model and the storage-specific schema can change without changing the application code.
4. Developers can work with a consistent application object model that can be mapped to various storage schemas, possibly implemented in different database management systems.
5. Multiple conceptual models can be mapped to a single storage schema.
6. Language-integrated query (LINQ) support provides compile-time syntax validation for queries against a conceptual model.
Windows Communication Foundation (WCF) is an SDK for developing and deploying services on Windows. WCF provides a runtime environment for your services, enabling you to expose CLR types as services, and to consume other services as CLR types. Although in theory you could build services without WCF, in practice building services is significantly easier with WCF. WCF is Microsoft's implementation of a set of industry standards defining service interactions, type conversion, marshaling, and the management of various protocols. Because of that, WCF provides interoperability between services. WCF provides developers with the essential off-the-shelf plumbing required by almost any application, and as such, it greatly increases productivity. The first release of WCF provides many useful facilities for developing services, such as hosting, service instance management, asynchronous calls, reliability, transaction management, disconnected queued calls, and security. WCF also has an elegant extensibility model that you can use to enrich the basic offering. In fact, WCF itself is written using this extensibility model. Almost all of the WCF functionality is included in a single assembly called System.ServiceModel.dll in the System.ServiceModel namespace.
Saturday, January 30, 2010
SQL Server Reference Guide
DATA TYPES IN SQL SERVER
Exact numerics
Type | From | To |
Bigint | -9,223,372,036,854,775,808 | 9,223,372,036,854,775,807 |
Int | -2,147,483,648 | 2,147,483,647 |
Smallint | -32,768 | 32,767 |
Tinyint | 0 | 255 |
bit | 0 | 1 |
decimal | -10^38 +1 | 10^38 -1 |
numeric | -10^38 +1 | 10^38 -1 |
money | -922,337,203,685,477.5808 | +922,337,203,685,477.5807 |
smallmoney | -214,748.3648 | +214,748.3647 |
numeric and decimal are fixed-precision, fixed-scale data types and are functionally equivalent.
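For example, these two declarations behave identically:
DECLARE @d decimal(18, 2), @n numeric(18, 2) -- same precision (18) and scale (2); interchangeable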
Approximate numerics
Type | From | To |
float | -1.79E + 308 | 1.79E + 308 |
real | -3.40E + 38 | 3.40E + 38 |
datetime and smalldatetime
Type | From | To |
datetime (3.33 milliseconds accuracy) | Jan 1, 1753 | Dec 31, 9999 |
smalldatetime (1 minute accuracy) | Jan 1, 1900 | Jun 6, 2079 |
Character Strings
Type | Description |
char | Fixed-length non-Unicode character data with a maximum length of 8,000 characters. |
varchar | Variable-length non-Unicode data with a maximum of 8,000 characters. |
varchar(max) | Variable-length non-Unicode data with a maximum length of 2^31-1 characters (SQL Server 2005 only). |
text | Variable-length non-Unicode data with a maximum length of 2,147,483,647 characters. |
Unicode Character Strings
Type | Description |
nchar | Fixed-length Unicode data with a maximum length of 4,000 characters. |
nvarchar | Variable-length Unicode data with a maximum length of 4,000 characters. |
nvarchar(max) | Variable-length Unicode data with a maximum length of 2^30-1 characters (SQL Server 2005 only). |
ntext | Variable-length Unicode data with a maximum length of 1,073,741,823 characters. |
Binary Strings
Type | Description |
binary | Fixed-length binary data with a maximum length of 8,000 bytes. |
varbinary | Variable-length binary data with a maximum length of 8,000 bytes. |
varbinary(max) | Variable-length binary data with a maximum length of 2^31-1 bytes (SQL Server 2005 only). |
image | Variable-length binary data with a maximum length of 2,147,483,647 bytes. |
Other Data Types
- sql_variant: Stores values of various SQL Server-supported data types, except text, ntext, and timestamp.
- timestamp: Stores a database-wide unique number that gets updated every time a row gets updated.
- uniqueidentifier: Stores a globally unique identifier (GUID).
- xml: Stores XML data. You can store xml instances in a column or a variable (SQL Server 2005 only).
- cursor: A reference to a cursor.
- table: Stores a result set for later processing.
Date and Time Functions
Date and time functions enable a programmer to get the system date, as well as to manipulate the date and time values stored in the database. Because date and time functions are useful in many different circumstances, it's difficult to emphasize one particular usage as being more essential than the others.
One complaint that you might hear is that SQL Server does not allow storing only the date or only the time—you must store both date and time in the same column if you use the DATETIME or SMALLDATETIME data type. Of course, you have the alternative of storing date values as strings, as in '1/1/2003'. Another alternative is to use the DATETIME data type and then use one of the date and time functions to retrieve only the needed portion (date or time) from the table.
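For instance, a minimal sketch of that second alternative, using CONVERT styles (101 and 108 are standard style codes) to pull out just the date or just the time portion:
SELECT
CONVERT(varchar(10), GETDATE(), 101) AS date_only, -- mm/dd/yyyy
CONVERT(varchar(8), GETDATE(), 108) AS time_only -- hh:mm:ss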
One of my other InformIT.com articles covered the functions DATEPART and DATENAME in detail. DATEPART retrieves a portion of the date and time value (month, day, weekday, year, hour, minute, and so on) as an integer; DATENAME returns a character string: the name for parts such as weekday and month, and the numeric value as a string for the other parts. DATEPART is deterministic in all cases except when it refers to weekdays. DATENAME, on the other hand, is non-deterministic. (Please refer to my article "SQL Server: Determining Whether a Date is a Business Day" for extensive coverage of these functions.)
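For example (the weekday number assumes the default DATEFIRST setting, where Sunday is day 1):
SELECT
DATEPART(WEEKDAY, '1/1/2003') AS weekday_number, -- 4
DATENAME(WEEKDAY, '1/1/2003') AS weekday_name, -- Wednesday
DATENAME(MONTH, '1/1/2003') AS month_name -- January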
The DAY(), MONTH(), and YEAR() functions are deterministic. Each of these functions accepts a single date value as a parameter and returns respective portions of the date as an integer. All three of these functions can be duplicated by retrieving the same portions of the date using the DATEPART function, as shown in the following example:
SELECT
DAY('1/1/2003'),
MONTH('1/1/2003'),
YEAR('1/1/2003'),
DATEPART(DAY, '1/1/2003'),
DATEPART(MONTH, '1/1/2003'),
DATEPART(YEAR, '1/1/2003')
Results:
----------- ----------- ----------- ----------- ----------- -----------
1 1 2003 1 1 2003
GETDATE() and GETUTCDATE() both return the current date and time. However, GETUTCDATE() returns the current Coordinated Universal Time (UTC), whereas GETDATE() returns the date and time on the computer where SQL Server is running. By the way, GETUTCDATE() does not have any magic power for determining the appropriate UTC time—it simply compares the time zone of the SQL Server computer with the UTC time zone. Note that neither of these functions accepts parameters, and they are both non-deterministic. Here is an example:
SELECT
GETDATE() AS local_date,
GETUTCDATE() AS UTC_date
Results:
local_date UTC_date
------------------------- ----------------------------
2003-01-26 14:32:35.713 2003-01-26 20:32:35.713
The DATEADD() and DATEDIFF() functions are both deterministic and can be very helpful in reporting applications (among other uses). The DATEADD() function adds a certain period of time to the existing date and time value. For instance, you can use the following query to determine the date six months from today:
SELECT DATEADD(MONTH, 6, GETDATE()) AS '6_months_from_now'
Results:
6_months_from_now
------------------------
2003-07-26 14:38:55.960
DATEADD() is also often used to determine which rows qualify for a particular report. Suppose that you want to see a report of all titles that have sold in the past nine years; you can effectively use DATEADD() with -9 as a parameter, as follows:
SELECT
DISTINCT
a.title_id,
title,
YEAR(ord_date) AS year_sold
FROM sales a INNER JOIN titles b
ON a.title_id = b.title_id
WHERE
ord_date >= DATEADD(YEAR, -9, GETDATE())
Results:
title_id title year_sold
-------- ---------------------------------------- -----------
BU1032 The Busy Executive's Database Guide 1994
BU2075 You Can Combat Computer Stress! 2002
MC3021 The Gourmet Microwave 1994
PS2091 Is Anger the Enemy? 1994
The DATEDIFF() function accepts two DATETIME values and a date portion (minute, hour, day, month, and so on) as parameters. DATEDIFF() determines the difference between the two date values passed, expressed in the date portion specified. Notice that the start date should come before the end date if you want to see positive numbers in the result set. For instance, the following query determines the time difference between today and when each sale occurred in terms of months (the output is limited to the first five rows to save some room):
SELECT TOP 5
ord_date,
DATEDIFF(MONTH, ord_date, GETDATE()) AS no_of_months_since
FROM sales
Results:
ord_date no_of_months_since
---------------------------- -----------------
1994-09-14 00:00:00.000 100
1994-09-13 00:00:00.000 100
1993-05-24 00:00:00.000 116
1994-09-13 00:00:00.000 100
1994-09-14 00:00:00.000 100
DATEDIFF() will work even if the end date is earlier than the start date—you will simply get negative values in the output. Keep in mind that DATEDIFF() returns an INTEGER; it does not calculate fractions for you. This might not seem relevant at first glance, but check out what happens when you compare the beginning and end of 2002 with the first day of 2003:
SELECT
DATEDIFF (YEAR, '1/1/2002', '1/1/2003'),
DATEDIFF (YEAR, '12/31/2002', '1/1/2003')
Results:
----------- -----------
1 1
So whether it is the first or last day of 2002, the difference in years is still 1, regardless of what you might expect.
Mathematical Functions
The large majority of math functions are specific to applications using trigonometry, calculus, and geometry. Because such functions are very seldom used in typical business applications, this article covers them only lightly.
There are, however, a few math functions that deserve more attention and have a more widespread usage. One of these is RAND(), which generates a random number between 0 and 1, expressed as a FLOAT data type. Returning random numbers can be useful for various programming tasks: returning result sets in a random order, for example.
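For example, one common way to return rows in random order is to sort by NEWID() rather than RAND(), because RAND() is evaluated only once per query rather than once per row:
SELECT TOP 5 title_id, title FROM titles ORDER BY NEWID()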
The RAND() function can be executed without any parameters, or you can specify a seed value that can have a TINYINT, SMALLINT, or INT data type. If a seed is specified, RAND() always returns the same number for that seed. A trick you might find useful is to pass the current millisecond to RAND() as the seed, as in the following example:
SELECT RAND(DATEPART(MILLISECOND, GETDATE()))
Note that all mathematical functions are deterministic except RAND() when it is used without a seed.
The CEILING() and FLOOR() functions accept a numeric expression as a single parameter and work similarly. CEILING() returns the smallest integer value greater than or equal to the parameter, whereas FLOOR() returns the largest integer less than or equal to the parameter. The following example returns the price range for a few titles in the pubs database:
SELECT TOP 4
title_id,
price,
CEILING(price) AS price_ceiling,
FLOOR(price) AS price_floor
FROM titles
Results:
title_id price price_ceiling price_floor
-------- ------------- --------------- ---------------
BU1032 19.9900 20.0000 19.0000
BU1111 11.9500 12.0000 11.0000
BU2075 2.9900 3.0000 2.0000
BU7832 19.9900 20.0000 19.0000
The ABS() function returns the absolute (positive) value of the given expression. It provides a way to have positive values in the output, even if values within tables are negative. For instance, suppose you are generating an accounting report that must contain credit and debit values in separate columns (both in positive), but your source column for both credit and debit is the balance column. You can generate the report as follows:
DECLARE @account TABLE (
account_number INT,
balance INT)
INSERT @account
VALUES(123, -200)
INSERT @account
VALUES(321, 2)
SELECT
account_number,
'credit' = CASE WHEN balance < 0 THEN ABS(balance) ELSE 0 END,
'debit' = CASE WHEN balance >= 0 THEN balance ELSE 0 END
FROM @account
Results:
account_number credit debit
-------------- ----------- -----------
123 200 0
321 0 2
The following table summarizes the rest of the math functions.
Mathematical Function | Returns |
ACOS() | Angle whose cosine expressed in radians is the expression passed |
ASIN() | Angle whose sine expressed in radians is the expression passed |
ATAN() | Angle whose tangent expressed in radians is the expression passed |
ATN2() | Angle whose tangent expressed in radians is between the two expressions passed |
COS() | Cosine of the given angle in radians |
COT() | Cotangent of the given angle in radians |
DEGREES() | Angle in degrees for the angle provided in radians |
EXP() | Exponential value of the passed expression |
LOG() | Natural logarithm of the passed expression |
LOG10() | Base 10 logarithm of the passed expression |
PI() | Value of PI: Does not accept any parameters |
POWER() | Value of the passed expression raised to the specified power (as in 2 to the 4th power) |
RADIANS() | Angle in radians for the angle provided in degrees |
ROUND() | Value of the passed parameter, rounded to the specified precision |
SIGN() | +1 for non-negative numeric expressions; (-1) for negative numeric expressions |
SIN() | Sine of the angle provided in radians |
SQUARE() | Square of the given numeric expression |
SQRT() | Square root of the given numeric expression |
TAN() | Tangent of the given angle in radians |
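A few quick examples of these functions:
SELECT
POWER(2, 4) AS two_to_the_4th, -- 16
ROUND(123.4567, 2) AS rounded, -- 123.4600
SQRT(16) AS square_root, -- 4.0
SIGN(-42) AS sign_value -- -1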
Text and Image Functions
Text and image functions operate on the TEXT, NTEXT and IMAGE data types. Here's a database design hint: In most cases, it is prudent to keep large text in a document on the file system and store a link to that document within the database. Columns with TEXT data type do not support commonly used string functions such as LEN, LEFT, RIGHT, and so on. Further, due to the large size of TEXT data, you're likely to have performance issues if you store such data in the database. However, you can't help but use the TEXT data type sometimes—this is when text functions come in handy.
The PATINDEX() function is useful with the TEXT data type, but it can also be used with columns having the CHAR or VARCHAR data types. In fact, PATINDEX() is very similar to CHARINDEX(), which tends to be used more frequently with string expressions. PATINDEX() searches for the first occurrence of a pattern within a string. If the pattern is found, PATINDEX() returns the character number where the first occurrence of the pattern begins. For instance, the following query creates a table variable with the TEXT data type and then uses PATINDEX to search for occurrences of the "tex" pattern:
DECLARE @table TABLE ( a TEXT)
INSERT @table VALUES ('ouch, this is some really long text')
SELECT PATINDEX('%tex%', a) FROM @table
Results:
-----------
32
Notice that the expression sought by PATINDEX() must be enclosed in "%" signs. Within the expression we can use the wildcard "%" for any number of characters, "_" for a single character, [] to specify a range of characters, and [^] to exclude a range, just as you do when using LIKE comparisons. For instance, you can modify the previous query slightly with a wildcard and get a different result:
SELECT PATINDEX('%t%x%', a) FROM @table
Results:
-----------
7
CHARINDEX() differs from PATINDEX() in that it searches for an exact match, in other words, the occurrence of one expression within another, rather than searching for a pattern. Therefore, you cannot use wildcards with CHARINDEX(); if you do, the wildcards will be interpreted as if you were searching for those literal characters ("%" or "_").
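For example, repeating the earlier search with CHARINDEX() against a plain string (CHARINDEX() does not accept the TEXT data type, so a character string is used here):
SELECT CHARINDEX('text', 'ouch, this is some really long text')
Results:
-----------
32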
The TEXTPTR() function returns the binary pointer to the TEXT, NTEXT, or IMAGE column in a table. After you get the binary value from TEXTPTR(), you can use it for READTEXT, WRITETEXT, and UPDATETEXT statements. For instance, the following query uses TEXTPTR to get pointers to the text column values within the table variable:
DECLARE @table TABLE (
a TEXT)
INSERT @table
VALUES ('ouch, this is some really long text')
INSERT @table
VALUES ('here is some more text')
SELECT TEXTPTR(a) FROM @table
Results:
----------------------------------
0x0100010000000000EC6FB91F00000000
0x0100020000000000EC6FB91F01000000
The TEXTVALID() function provides a way to see whether an existing pointer to a column with TEXT, NTEXT, or IMAGE data type is valid. The next batch of SQL statements creates a temporary table with a TEXT data type, adds a row, gets the pointer, and then ensures that the pointer is valid:
CREATE TABLE #table (
a TEXT)
DECLARE @textptr VARBINARY(16)
INSERT #table
VALUES ('ouch, this is some really long text')
SELECT @textptr = TEXTPTR(a) FROM #table
SELECT TEXTVALID('#table.a', @textptr) AS is_pointer_valid
DROP TABLE #table
Results:
is_pointer_valid
----------------
1
Triggers
Triggers are special types of Stored Procedures that are defined to execute automatically in place of or after data modifications. They can be executed automatically on the INSERT, DELETE and UPDATE triggering actions.
There are two different types of triggers in Microsoft SQL Server 2000. They are INSTEAD OF triggers and AFTER triggers. These triggers differ from each other in terms of their purpose and when they are fired. In this article we shall discuss each type of trigger.
A trigger is an object contained within an SQL Server database that is used to execute a batch of SQL code whenever a specific event occurs. As the name suggests, a trigger is "fired" whenever an INSERT, UPDATE, or DELETE SQL command is executed against a specific table.
Triggers are associated with a single table, and are automatically executed internally by SQL Server. Let's create a very basic trigger now (I am using Microsoft SQL Server 2000 on a Windows 2000 machine).
First of all, let's create a sample database with some tables and insert some sample data in those tables using the script below:
Create Database KDMNN
GO
USE KDMNN
GO
CREATE TABLE [dbo].[User_Details] (
[UserID] [int] NULL ,
[FName] [varchar] (50) NOT NULL ,
[MName] [varchar] (50) NULL ,
[LName] [varchar] (50) NOT NULL ,
[Email] [varchar] (50) NOT NULL
) ON [PRIMARY]
GO
CREATE TABLE [dbo].[User_Master] (
[UserID] [int] IDENTITY (1, 1) NOT NULL ,
[UserName] [varchar] (50) NULL ,
[Password] [varchar] (50) NULL
) ON [PRIMARY]
GO
ALTER TABLE [dbo].[User_Master] WITH NOCHECK ADD
CONSTRAINT [PK_User_Master] PRIMARY KEY CLUSTERED
(
[UserID]
) ON [PRIMARY]
GO
ALTER TABLE [dbo].[User_Details] ADD
CONSTRAINT [FK_User_Details_User_Master] FOREIGN KEY
(
[UserID]
) REFERENCES [dbo].[User_Master] (
[UserID]
)
GO
INSERT INTO USER_MASTER(USERNAME, PASSWORD)
SELECT 'Navneeth','Navneeth' UNION
SELECT 'Amol','Amol' UNION
SELECT 'Anil','Anil' UNION
SELECT 'Murthy','Murthy'
INSERT INTO USER_DETAILS(USERID, FNAME, LNAME, EMAIL)
SELECT 1,'Navneeth','Naik','navneeth@kdmnn.com' UNION
SELECT 2,'Amol','Kulkarni','amol@kdmnn.com' UNION
SELECT 3,'Anil','Bahirat','anil@kdmnn.com' UNION
SELECT 4,'Murthy','Belluri','murthy@kdmnn.com'
AFTER Triggers
The type of trigger that gets executed automatically after the statement that triggered it completes is called an AFTER trigger. Note, however, that an AFTER trigger executes before the transaction is committed or rolled back.
Using the below script, first we shall create a trigger on the table USER_MASTER for the INSERT event of the table.
USE KDMNN
Go
CREATE TRIGGER trgInsert
ON User_Master
FOR INSERT
AS
Print ('AFTER Trigger [trgInsert] – Trigger executed !!')
GO
BEGIN TRANSACTION
DECLARE @ERR INT
INSERT INTO USER_MASTER(USERNAME, PASSWORD)
VALUES('Damerla','Damerla')
SET @ERR = @@Error
IF @ERR = 0
BEGIN
ROLLBACK TRANSACTION
PRINT 'ROLLBACK TRANSACTION'
END
ELSE
BEGIN
COMMIT TRANSACTION
PRINT 'COMMIT TRANSACTION'
END
Output
AFTER Trigger [trgInsert] – Trigger executed !!
(1 row(s) affected)
ROLLBACK TRANSACTION
By looking at the output, we can conclude that before the transaction is rolled back or committed, the AFTER trigger gets executed automatically. A table can have several AFTER triggers for each of the three triggering actions, i.e., INSERT, DELETE and UPDATE. Using the below script, we shall create two more triggers on the table User_Master for the INSERT triggering action.
CREATE TRIGGER trgInsert2
ON User_Master
FOR INSERT
AS
BEGIN
Print ('AFTER Trigger [trgInsert2] – Trigger executed !!')
END
GO
CREATE TRIGGER trgInsert3
ON User_Master
FOR INSERT
AS
BEGIN
Print ('AFTER Trigger [trgInsert3] – Trigger executed !!')
END
GO
BEGIN TRANSACTION
DECLARE @ERR INT
INSERT INTO USER_MASTER(USERNAME, PASSWORD)
VALUES('Damerla','Damerla')
SET @ERR = @@Error
IF @ERR = 0
BEGIN
ROLLBACK TRANSACTION
PRINT 'ROLLBACK TRANSACTION'
END
ELSE
BEGIN
COMMIT TRANSACTION
PRINT 'COMMIT TRANSACTION'
END
Output
AFTER Trigger [trgInsert] – Trigger executed !!
AFTER Trigger [trgInsert2] – Trigger executed !!
AFTER Trigger [trgInsert3] – Trigger executed !!
(1 row(s) affected)
ROLLBACK TRANSACTION
From the output we can conclude that when the user tries to insert data in the table USER_MASTER, three triggers are executed automatically. That is, you can write several AFTER triggers on one table for each of the three triggering actions.
Similarly, we can write several AFTER triggers on DELETE and UPDATE triggering actions.
Note: If a table has multiple AFTER triggers, then you can specify which trigger should be executed first and which trigger should be executed last using the stored procedure sp_settriggerorder. All the other triggers are in an undefined order which you cannot control.
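For example, to make trgInsert fire first among the INSERT triggers on User_Master:
EXEC sp_settriggerorder @triggername = 'trgInsert', @order = 'First', @stmttype = 'INSERT'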
An AFTER trigger can be created only on tables, not on views.
Using the script below, first we shall create a simple view [vwUserMaster] which will fetch the Username and Password from the table USER_MASTER.
Create View vwUserMaster
as
SELECT USERNAME, PASSWORD FROM USER_MASTER
GO
CREATE TRIGGER trgOnView
ON vwUserMaster
FOR INSERT
AS
BEGIN
Print ('AFTER Trigger [trgOnView] – vwUserMaster !!')
END
GO
Output
Server: Msg 208, Level 16, State 4, Procedure trgOnView, Line 2
Invalid object name 'vwUserMaster'.
From the Output we can conclude that we cannot create an AFTER trigger on views.
Like stored procedures and views, triggers can also be encrypted. The trigger definition is then stored in an unreadable form. Once encrypted, the definition of the trigger cannot be decrypted and cannot be viewed by anyone, including the owner of the trigger or the system administrator.
CREATE TRIGGER trgEncrypted
ON User_Master WITH ENCRYPTION
FOR INSERT
AS
BEGIN
Print ('AFTER Trigger [trgEncrypted] Encrypted – Trigger executed !!')
END
GO
SELECT
sysobjects.name AS [Trigger Name],
SUBSTRING(syscomments.text, 0, 26) AS [Trigger Definition],
OBJECT_NAME(sysobjects.parent_obj) AS [Table Name],
syscomments.encrypted AS [IsEncrypted]
FROM
sysobjects INNER JOIN syscomments ON sysobjects.id = syscomments.id
WHERE
(sysobjects.xtype = 'TR')
Output
Trigger Name Trigger Definition Table Name IsEncrypted
-----------------------------------------------------------------
trgInsert CREATE TRIGGER trgInsert User_Master 0
trgInsert2 CREATE TRIGGER trgInsert2 User_Master 0
trgInsert3 CREATE TRIGGER trgInsert3 User_Master 0
trgEncrypted ᩼恩Ꭻ羑䛆䜤녪样窈ᒢꢞ뾴䕭遚� User_Master 1
Since the trigger trgEncrypted is created with the option WITH ENCRYPTION, the trigger definition is hidden and there is no way that one can easily decrypt the trigger code.
We all know that DML statements change or modify data. Sometimes it becomes necessary for triggers to have access to the changes being caused by the DML statements. SQL Server 2000 provides four different ways to determine the effects of the DML statements: the INSERTED and DELETED tables, popularly known as MAGIC TABLES, and the UPDATE() and COLUMNS_UPDATED() functions can be used to determine the changes caused by the DML statements.
The below table depicts the contents of the INSERTED and DELETED tables for the three different events:
EVENT | INSERTED | DELETED |
Insert | Contains the inserted rows | Empty |
Delete | Empty | Contains the rows to be deleted |
Update | Contains the rows after update | Contains the rows before update |
Note that the magic tables do not contain information about columns of the text, ntext, or image data types. Attempting to access these columns will cause an error.
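As a minimal sketch (the trigger name here is illustrative), an UPDATE trigger can join the two magic tables to see the before and after values of each row:
CREATE TRIGGER trgShowChanges
ON User_Master
FOR UPDATE
AS
SELECT d.UserName AS old_username, i.UserName AS new_username
FROM DELETED d INNER JOIN INSERTED i ON d.UserID = i.UserID
GO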
The UPDATE() function is used to find out whether a particular column has been updated or not. This function is generally used for data checks.
CREATE TRIGGER trgUpdate
ON User_Details
FOR UPDATE
AS
IF UPDATE(FName)
BEGIN
PRINT('AFTER Trigger [trgUpdate] - Executed - First Name has been updated')
ROLLBACK TRANSACTION
END
ELSE IF UPDATE(LName)
BEGIN
PRINT('AFTER Trigger [trgUpdate] - Executed - Last Name has been updated')
ROLLBACK TRANSACTION
END
ELSE IF UPDATE(MName)
BEGIN
PRINT('AFTER Trigger [trgUpdate] - Executed - Middle Name has been updated')
ROLLBACK TRANSACTION
END
ELSE IF UPDATE(Email)
BEGIN
PRINT('AFTER Trigger [trgUpdate] - Executed - Email has been updated')
ROLLBACK TRANSACTION
END
GO
UPDATE User_Details
SET MName = 'Diwaker'
WHERE UserID = 1
Output
AFTER Trigger [trgUpdate] - Executed - Middle Name has been updated
Depending upon the column updated, a message will be displayed. With this feature we can determine which column in the table has been updated, and then proceed with the business rules to be implemented further.
The COLUMNS_UPDATED() function returns a varbinary representation of the columns updated. It returns a hexadecimal value from which we can determine which columns in the table have been updated.
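A minimal sketch (the trigger name is illustrative): each bit in the returned value maps to a column by ordinal position, so bit 3 (value 4) corresponds to MName, the third column of User_Details:
CREATE TRIGGER trgColsUpdated
ON User_Details
FOR UPDATE
AS
IF (COLUMNS_UPDATED() & 4) = 4
PRINT 'MName has been updated'
GO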
INSTEAD OF Triggers
A trigger that gets executed automatically in place of the triggering action, i.e., INSERT, DELETE, or UPDATE, is called an INSTEAD OF trigger.
INSTEAD OF triggers get executed automatically before the Primary Key and the Foreign Key constraints are checked, whereas traditional AFTER triggers get executed automatically after these constraints are checked.
CREATE TRIGGER trgAfterInsert
On User_Details
FOR INSERT
AS
BEGIN
Print ('AFTER Trigger [trgAfterInsert] – Trigger executed !!')
END
INSERT INTO USER_DETAILS(USERID, FNAME,LNAME, MNAME, EMAIL)
VALUES(100, 'FName','LName','MName','test@test.com')
Output
Server: Msg 547, Level 16, State 1, Line 1
INSERT statement conflicted with COLUMN FOREIGN KEY constraint 'FK_User_Details_User_Master'. The conflict occurred in database 'KDMNN', table 'User_Master', column 'UserID'.
The statement has been terminated.
UserID 100 does not exist in the User_Master table, so the Foreign Key constraint has been checked and an error message is displayed. What we can conclude is: AFTER triggers get executed automatically after the PK and FK constraints are checked.
Create Trigger trgInsteadInsert
On User_Details
INSTEAD OF INSERT
AS
BEGIN
Print ('INSTEAD OF Trigger [trgInsteadInsert] – Trigger executed !!')
END
INSERT INTO USER_DETAILS(USERID, FNAME,LNAME, MNAME, EMAIL)
VALUES(100, 'FName','LName','MName','test@test.com')
Output
INSTEAD OF Trigger [trgInsteadInsert] – Trigger executed !!
(1 row(s) affected)
Even though UserID 100 does not exist in the User_Master table, the trigger gets executed automatically, because the INSTEAD OF trigger fires before the Foreign Key constraint is checked.
DROP Trigger trgInsteadInsert
Unlike AFTER triggers, INSTEAD OF triggers can be created on views.
Create trigger trgOnView
on vwUserMaster
INSTEAD OF INSERT
AS
begin
Print ('INSTEAD OF Trigger [trgOnView] – vwUserMaster !!')
End
INSERT INTO VWUSERMASTER(USERNAME, PASSWORD)
VALUES('Damerla','Venkat')
Output
INSTEAD OF Trigger [trgOnView] – vwUserMaster !!
(1 row(s) affected)
So whenever a user tries to insert data into the view vwUserMaster, the INSTEAD OF trigger trgOnView will automatically be executed.
In SQL Server 2000, views can be used to INSERT, DELETE, and UPDATE data in multiple tables, and this can be achieved using INSTEAD OF triggers.
CREATE VIEW vwUser
AS
SELECT
[User_Master].[Username],
[User_Master].[Password],
[User_Details].[FName],
[User_Details].[MName],
[User_Details].[LName],
[User_Details].[Email]
FROM
[User_Master], [User_Details]
WHERE
[User_Master].[UserID]=[User_Details].[UserID]
CREATE TRIGGER tgrInsertData
ON vwUser
INSTEAD OF INSERT
AS
BEGIN
Declare @UserName varchar(50)
Declare @Password varchar(50)
Declare @FName varchar(50)
Declare @MName varchar(50)
Declare @LName varchar(50)
Declare @Email varchar(50)
SELECT
@UserName = UserName,
@Password = Password,
@FName = FName,
@MName = MName,
@LName = LName,
@Email = Email
FROM INSERTED
INSERT INTO User_Master(UserName, Password)
VALUES(@UserName, @Password)
INSERT INTO User_Details(UserID,FName,LName,MName,Email) VALUES(@@Identity, @FName, @LName, @MName, @Email)
END
INSERT INTO vwUser(UserName,Password,FName,LName,MName,Email)
VALUES ('Dhananjay','Dhananjay','Dhananjay','Nagesh',NULL,'dhananjay@kdmnn.com')
Output
AFTER Trigger [trgInsert] – Trigger executed !!
AFTER Trigger [trgInsert2] – Trigger executed !!
AFTER Trigger [trgInsert3] – Trigger executed !!
AFTER Trigger [trgEncrypted] Encrypted – Trigger executed !!
(1 row(s) affected)
AFTER Trigger [trgAfterInsert] – Trigger executed !!
(1 row(s) affected)
Now check the data in the User_Master and User_Details tables: the new row gets inserted into both.
A view or table can have only one INSTEAD OF trigger for each of the INSERT, UPDATE and DELETE events.
We have seen that you can create any number of AFTER triggers on the table for the same event, whereas you cannot do that with INSTEAD OF triggers.
CREATE TRIGGER trgInsteadOfTrigger1
ON vwUserMaster
INSTEAD OF UPDATE
AS
BEGIN
Print ('INSTEAD OF Trigger [trgInsteadOfTrigger1] – Trigger executed !!')
END
CREATE TRIGGER trgInsteadOfTrigger2
ON vwUserMaster
INSTEAD OF UPDATE
AS
BEGIN
Print ('INSTEAD OF Trigger [trgInsteadOfTrigger2] – Trigger executed !!')
END
Output
Server: Msg 2111, Level 16, State 1, Procedure trgInsteadOfTrigger2, Line 6
Cannot CREATE trigger 'trgInsteadOfTrigger2' for view 'vwUserMaster' because an INSTEAD OF UPDATE trigger already exists.
From the output, it is clear that you cannot create two INSTEAD OF triggers on the same view or table for the same event.
Note: INSTEAD OF DELETE and INSTEAD OF UPDATE triggers cannot be defined on tables that have corresponding ON DELETE or ON UPDATE cascading referential integrity defined.
At last, how would you know which triggers are associated with a table, and whether each one is an AFTER or an INSTEAD OF trigger?
The answer is sp_helptrigger. This stored procedure gives all the information about the triggers, such as the event on which each trigger gets executed, the type of the trigger, etc.
EXEC sp_helptrigger User_Master
Output
trigger_name trigger_owner isupdate isdelete isinsert isafter isinsteadof
trgInsert dbo 0 0 1 1 0
trgInsert2 dbo 0 0 1 1 0
trgInsert3 dbo 0 0 1 1 0
trgEncrypted dbo 0 0 1 1 0
Triggers can be used in scenarios such as the following: when the database is de-normalized and requires an automated way to update redundant data contained in multiple tables; when customized messages and complex error handling are required; or when a value in one table must be validated against a non-identical value in another table.
Triggers are a powerful tool that can be used to enforce business rules automatically when data is modified. Triggers can also be used to maintain data integrity, but they are not a substitute for constraints: use triggers to maintain data integrity only if you cannot enforce it using CONSTRAINTS, RULES and DEFAULTS. Triggers cannot be created on temporary tables.
SQL Server Indexes
Relational databases like SQL Server use indexes to find data quickly when a query is processed. Creating and removing indexes from a database schema will rarely result in changes to an application's code; indexes operate 'behind the scenes' in support of the database engine. However, creating the proper index can drastically increase the performance of an application.
The SQL Server engine uses an index in much the same way a reader uses a book index. For example, one way to find all references to INSERT statements in a SQL book would be to begin on page one and scan each page of the book. We could mark each time we find the word INSERT until we reach the end of the book. This approach is pretty time consuming and laborious. Alternately, we can also use the index in the back of the book to find a page number for each occurrence of the INSERT statements. This approach produces the same results as above, but with tremendous savings in time.
When SQL Server has no index to use for searching, the result is similar to the reader who looks at every page in a book to find a word: the SQL engine needs to visit every row in a table. In database terminology we call this behavior a table scan, or just a scan.
A table scan is not always a problem, and is sometimes unavoidable. However, as a table grows to thousands of rows and then millions of rows and beyond, scans become correspondingly slower and more expensive.
Consider the following query on the Products table of the Northwind database. This query retrieves products in a specific price range.
SELECT ProductID, ProductName, UnitPrice
FROM Products WHERE (UnitPrice > 12.5) AND (UnitPrice < 14)
There is currently no index on the Products table to help this query, so the database engine performs a scan and examines each record to see if UnitPrice falls between 12.5 and 14. In the sample Northwind data, the search touches a total of 77 records to find just three matches.
Now imagine if we created an index, just like a book index, on the data in the UnitPrice column. Each index entry would contain a copy of the UnitPrice value for a row, and a reference (just like a page number) to the row where the value originated. SQL will sort these index entries into ascending order. The index will allow the database to quickly narrow in on the three rows to satisfy the query, and avoid scanning every row in the table.
Create An Index
Having a data connection in the Server Explorer view of Visual Studio .NET allows us to easily create new indexes:
- Navigate to the Products table of the Northwind database.
- Right click the table and select Design Table from the context menu.
- With the design screen in focus, click the Indexes/Keys item on the View menu of the IDE.
The dialog is currently displaying an existing index on the Products table: the PK_Products index. We will see later in this article how primary key fields are automatically indexed to enforce uniqueness in the key values.
- In the above dialog click on the New button, and in the Index name text box, replace the existing entry with IDX_UnitPrice.
- Beneath the text box is a control where we set the columns to index. Pull down the entry with ProductID and select the UnitPrice column instead.
- Leave all of the other options with default settings.
- Close the dialog and the table design view, making sure to save all of the changes when prompted to do so. The IDE will then issue the commands to create the new index.
We can create the same index using the following SQL. The command specifies the name of the index (IDX_UnitPrice), the table name (Products), and the column to index (UnitPrice).
CREATE INDEX [IDX_UnitPrice] ON Products (UnitPrice)
To verify that the index is created, use the following stored procedure to see a list of all indexes on the Products table:
EXEC sp_helpindex Products
How It Works
The database takes the columns specified in a CREATE INDEX command and sorts the values into a special data structure known as a B-tree. A B-tree structure supports fast searches with a minimum amount of disk reads, allowing the database engine to quickly find the starting and stopping points for the query we are using.
Conceptually, we may think of an index as a sorted list of entries. Each index entry contains the index key (UnitPrice), and each entry also includes a reference (which points) to the table rows which share that particular value and from which we can retrieve the required information.
Much like the index in the back of a book helps us to find keywords quickly, so the database is able to quickly narrow the number of records it must examine to a minimum by using the sorted list of UnitPrice values stored in the index. We have avoided a table scan to fetch the query results. Given this sketch of how indexes work, let's examine some of the scenarios where indexes offer a benefit.
Taking Advantage of Indexes
The database engine can use indexes to boost performance in a number of different queries. Sometimes these performance improvements are dramatic. An important feature of SQL Server 2000 is a component known as the query optimizer. The query optimizer's job is to find the fastest and least resource intensive means of executing incoming queries. An important part of this job is selecting the best index or indexes to perform the task. In the following sections we will examine the types of queries with the best chance of benefiting from an index.
Searching For Records
The most obvious use for an index is in finding a record or set of records matching a WHERE clause. Indexes can aid queries looking for values inside of a range (as we demonstrated earlier), as well as queries looking for a specific value. By way of example, the following queries can all benefit from an index on UnitPrice:
DELETE FROM Products WHERE UnitPrice = 1
UPDATE Products SET Discontinued = 1 WHERE UnitPrice > 15
SELECT * FROM PRODUCTS WHERE UnitPrice BETWEEN 14 AND 16
Indexes work just as well when searching for a record in DELETE and UPDATE commands as they do for SELECT statements.
Sorting Records
When we ask for a sorted dataset, the database will try to find an index and avoid sorting the results during execution of the query. We control sorting of a dataset by specifying a field, or fields, in an ORDER BY clause, with the sort order as ASC (ascending) or DESC (descending). For example, the following query returns all products sorted by price:
SELECT * FROM Products ORDER BY UnitPrice ASC
With no index, the database will scan the Products table and sort the rows to process the query. However, the index we created on UnitPrice (IDX_UnitPrice) earlier provides the database with a presorted list of prices. The database can simply scan the index from the first entry to the last entry and retrieve the rows in sorted order.
The same index works equally well with the following query, simply by scanning the index in reverse.
SELECT * FROM Products ORDER BY UnitPrice DESC
Grouping Records
We can use a GROUP BY clause to group records and aggregate values, for example, counting the number of orders placed by a customer. To process a query with a GROUP BY clause, the database will often sort the results on the columns included in the GROUP BY. The following query counts the number of products at each price by grouping together records with the same UnitPrice value.
SELECT Count(*), UnitPrice FROM Products GROUP BY UnitPrice
The database can use the IDX_UnitPrice index to retrieve the prices in order. Since matching prices appear in consecutive index entries, the database is able to count the number of products at each price quickly. Indexing a field used in a GROUP BY clause can often speed up a query.
Maintaining a Unique Column
Columns requiring unique values (such as primary key columns) must have a unique index applied. There are several methods available to create a unique index. Marking a column as a primary key will automatically create a unique index on the column. We can also create a unique index by checking the Create UNIQUE checkbox in the dialog shown earlier. The screen shot of the dialog displayed the index used to enforce the primary key of the Products table. In this case, the Create UNIQUE checkbox is disabled, since an index to enforce a primary key must be a unique index. However, creating new indexes not used to enforce primary keys will allow us to select the Create UNIQUE checkbox. We can also create a unique index using SQL with the following command:
CREATE UNIQUE INDEX IDX_ProductName On Products (ProductName)
The above SQL command will not allow any duplicate values in the ProductName column, and an index is the best tool for the database to use to enforce this rule. Each time an application adds or modifies a row in the table, the database needs to search all existing records to ensure none of the values in the new data duplicate existing values. Indexes, as we should know by now, will improve this search time.
Index Drawbacks
There are tradeoffs to almost any feature in computer programming, and indexes are no exception. While indexes provide a substantial performance benefit to searches, there is also a downside to indexing. Let's talk about some of those drawbacks now.
Indexes and Disk Space
Indexes are stored on the disk, and the amount of space required will depend on the size of the table, and the number and types of columns used in the index. Disk space is generally cheap enough to trade for application performance, particularly when a database serves a large number of users. To see the space required for a table, use the sp_spaceused system stored procedure in a query window.
EXEC sp_spaceused Orders
Given a table name (Orders), the procedure will return the amount of space used by the data and all indexes associated with the table, like so:
Name     rows   reserved   data     index_size   unused
-------  -----  ---------  -------  -----------  -------
Orders   830    504 KB     160 KB   320 KB       24 KB
According to the output above, the table data uses 160 kilobytes, while the table indexes use twice as much, or 320 kilobytes. The ratio of index size to table size can vary greatly, depending on the columns, data types, and number of indexes on a table.
Indexes and Data Modification
Another downside to using an index is the performance implication on data modification statements. Any time a query modifies the data in a table (INSERT, UPDATE, or DELETE), the database needs to update all of the indexes where data has changed. As we discussed earlier, indexing can help the database during data modification statements by allowing the database to quickly locate the records to modify. However, we must now caveat that discussion: giving the database too many indexes to update can actually hurt the performance of data modifications. This leads to a delicate balancing act when tuning the database for performance.
In decision support systems and data warehouses, where information is stored for reporting purposes, data remains relatively static and report generating queries outnumber data modification queries. In these types of environments, heavy indexing is commonplace in order to optimize the reports generated. In contrast, a database used for transaction processing will see many records added and updated. These types of databases will use fewer indexes to allow for higher throughput on inserts and updates.
Every application is unique, and finding the best indexes to use for a specific application usually requires some help from the optimization tools offered by many database vendors. SQL Server 2000 includes the Profiler and Index Tuning Wizard tools to help with this tuning.
Now we have enough information to understand why indexes are useful and where indexes are best applied. It is time now to look at the different options available when creating an index and then address some common rules of thumb to use when planning the indexes for your database.
Clustered Indexes
Earlier in the article we made an analogy between a database index and the index of a book. A book index stores words in order with a reference to the page numbers where the word is located. This type of index for a database is a nonclustered index; only the index key and a reference are stored. In contrast, a common analogy for a clustered index is a phone book. A phone book still sorts entries into alphabetical order. The difference is, once we find a name in a phone book, we have immediate access to the rest of the data for the name, such as the phone number and address.
For a clustered index, the database will sort the table's records according to the column (or columns) specified by the index. A clustered index contains all of the data for a table in the index, sorted by the index key, just like a phone book is sorted by name and contains all of the information for the person inline. The nonclustered indexes created earlier in the chapter contain only the index key and a reference to find the data, which is more like a book index. You can only create one clustered index on each table.
Consider a search using a clustered index on the UnitPrice column of the Products table, and compare it to the nonclustered index on UnitPrice described earlier. Although we might picture only three columns from the Products table, all of the columns are present, the rows are sorted into the order of the index, and there is no reference to follow from the index back to the data.
A clustered index is the most important index you can apply to a table. If the database engine can use a clustered index during a query, the database does not need to follow references back to the rest of the data, as happens with a nonclustered index. The result is less work for the database and, consequently, better performance for a query using a clustered index.
To create a clustered index, simply select the Create As CLUSTERED checkbox in the dialog box we used at the beginning of the chapter. The SQL syntax for a clustered index simply adds a new keyword to the CREATE INDEX command, as shown below:
CREATE CLUSTERED INDEX IDX_SupplierID ON Products(SupplierID)
Most of the tables in the Northwind database already have a clustered index defined. Since we can only have one clustered index per table, and the Products table already has a clustered index (PK_Products) on the primary key (ProductId), the above command should generate the following error:
Cannot create more than one clustered index on table 'Products'.
Drop the existing clustered index 'PK_Products' before creating another.
As a general rule of thumb, every table should have a clustered index. If you create only one index for a table, use a clustered index. Not only is a clustered index more efficient than other indexes for retrieval operations, a clustered index also helps the database efficiently manage the space required to store the table. In SQL Server, creating a primary key constraint will automatically create a clustered index (if none exists) using the primary key column as the index key.
Sometimes it is better to use a unique nonclustered index on the primary key column, and place the clustered index on a column used by more queries. For example, if the majority of searches are for the price of a product instead of the primary key of a product, the clustered index could be more effective if used on the price field. A clustered index can also be a UNIQUE index.
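As a quick sketch of that arrangement (the constraint and index names below are illustrative, not taken from Northwind's actual scripts):
-- Enforce the primary key with a unique nonclustered index
ALTER TABLE Products
ADD CONSTRAINT PK_Products PRIMARY KEY NONCLUSTERED (ProductID)

-- Reserve the table's single clustered index for the heavily searched column
CREATE CLUSTERED INDEX IDX_Products_UnitPrice ON Products (UnitPrice)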
A Disadvantage to Clustered Indexes
If we update a record and change the value of an indexed column in a clustered index, the database might need to move the entire row into a new position to keep the rows in sorted order. This behavior essentially turns an update query into a DELETE followed by an INSERT, with an obvious decrease in performance. A table's clustered index can often be found on the primary key or a foreign key column, because key values generally do not change once a record is inserted into the database.
Composite Indexes
A composite index is an index on two or more columns. Both clustered and nonclustered indexes can be composite indexes. Composite indexes are especially useful in two different circumstances. First, you can use a composite index to cover a query. Second, you can use a composite index to help match the search criteria of specific queries. We will go into more detail and give examples of these two areas in the following sections.
Covering Queries with an Index
Earlier in the article we discussed how an index, specifically a nonclustered index, contains only the key values and a reference to find the associated row of data. However, if the key value contains all of the information needed to process a query, the database never has to follow the reference and find the row; it can simply retrieve the information from the index and save processing time. For a clustered index, which contains all of the table's data, this is always the case.
As an example, consider the index we created on the Products table for UnitPrice. The database copied the values from the UnitPrice column and sorted them into an index. If we execute the following query, the database can retrieve all of the information for the query from the index itself.
SELECT UnitPrice FROM Products ORDER BY UnitPrice
We call these types of queries covered queries, because all of the columns requested in the output are contained in the index itself. A clustered index, if selected for use by the query optimizer, always covers a query, since it contains all of the data in a table.
For the following query, there are no covering indexes on the Products table.
SELECT ProductName, UnitPrice FROM Products ORDER BY UnitPrice
This is because although the database will use the index on UnitPrice to avoid sorting records, it will need to follow the reference in each index entry to find the associated row and retrieve the product name. By creating a composite index on two columns (ProductName and UnitPrice), we can cover this query with the new index.
Matching Complex Search Criteria
For another way to use composite indexes, let's take a look at the Order Details table of Northwind. There are two key values in the table (OrderID and ProductID); these are foreign keys, referencing the Orders and Products tables respectively. There is no column dedicated for use as a primary key; instead, the primary key is the combination of the columns OrderID and ProductID.
The primary key constraint on these columns will generate a composite index, which is unique of course. The command the database would use to create the index looks something like the following:
CREATE UNIQUE CLUSTERED INDEX PK_Order_Details
ON [Order Details] (OrderID, ProductID)
The order in which columns appear in a CREATE INDEX statement is significant. The primary sort order for this index is OrderID. When the OrderID is the same for two or more records, the database will sort this subset of records on ProductID.
The order of columns determines how useful the index is for a query. Consider the phone book sorted by last name then first name. The phone book makes it easy to find all of the listings with a last name of Smith, or all of the listings with a last name of Jones and a first name of Lisa, but it is difficult to find all listings with a first name of Gary without scanning the book page by page.
Likewise, the composite index on Order Details is useful in the following two queries:
SELECT * FROM [Order Details] WHERE OrderID = 11077
SELECT * FROM [Order Details] WHERE OrderID = 11077 AND ProductID = 13
However, the following query cannot take advantage of the index we created since ProductID is the second part of the index key, just like the first name field in a phone book.
SELECT * FROM [Order Details] WHERE ProductID = 13
In this case, however, ProductID is itself a key column, and a separate index exists on the ProductID column for the database to use for this query.
Suppose the following query is the most popular query executed by our application, and we decided we needed to tune the database to support it.
SELECT ProductName, UnitPrice FROM Products ORDER BY UnitPrice
We could create the following index to cover the query. Notice we have specified two columns for the index: UnitPrice and ProductName (making the index a composite index):
CREATE INDEX IX_UnitPrice_ProductName ON Products(UnitPrice, ProductName)
While covered queries can provide a performance benefit, remember there is a price to pay for each index we add to a table, and we can also never cover every query in a non-trivial application.
Additional Index Guidelines
Choosing the correct columns and types for an index is another important step in creating an effective index. In this section, we will talk about two main points, namely short index keys and selective indexes (we'll explain what selective indexes are in just a moment).
Keep Index Keys Short
The larger an index key is, the harder a database has to work to use the index. For instance, an integer key is smaller in size than a character field holding 100 characters. In particular, keep clustered indexes as short as possible.
There are several approaches to keeping an index key short. First, try to limit the index to as few columns as possible. While composite indexes are useful and can sometimes optimize a query, they are also larger and cause more disk reads for the database. Secondly, try to choose a compact data type for an index column, based on the number of bytes required for each data type. Integer keys are small and easy for the database to compare. In contrast, strings require a character-by-character comparison.
As a rule of thumb, try to avoid using character columns in an index, particularly primary key indexes. Integer columns will always have an advantage over character fields in their ability to boost the performance of a query.
Distinct Index Keys
The most effective indexes are the indexes with a small percentage of duplicated values. Think of having a phone book for a city where 75% of the population has the last name of Smith. A phone book in this area might be easier to use if the entries were sorted by the resident's first names instead. A good index will allow the database to disregard as many records as possible during a search.
An index with a high percentage of unique values is a selective index. Obviously, a unique index is the most selective index of all, because there are no duplicate values. SQL Server will track statistics for indexes and will know how selective each index is. The query optimizer utilizes these statistics when selecting the best index to use for a query.
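If you are curious how selective an existing index is, the DBCC SHOW_STATISTICS command displays the statistics SQL Server keeps for it. For example, for the IDX_UnitPrice index created earlier (the output will vary with your data):
DBCC SHOW_STATISTICS ('Products', IDX_UnitPrice)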
Maintaining Indexes
In addition to creating an index, we'll need to view existing indexes, and sometimes delete or rename them. This is part of the ongoing maintenance cycle of a database as the schema changes, or even naming conventions change.
View Existing Indexes
A list of all indexes on a table is available in the dialog box we used to create an index. Click on the Selected index drop down control and scroll through the available indexes.
There is also a stored procedure named sp_helpindex. This stored procedure gives all of the indexes for a table, along with all of the relevant attributes. The only input parameter to the procedure is the name of the table, as shown below.
EXEC sp_helpindex Customers
Rename an Index
We can also rename any user-created object, including indexes, with the sp_rename stored procedure. The sp_rename procedure takes, at a minimum, the current name of the object and the new name for the object. For indexes, the current name must include the name of the table, a dot separator, and the name of the index, as shown below:
EXEC sp_rename 'Products.IX_UnitPrice', 'IX_Price'
This will change the name of the IX_UnitPrice index to IX_Price.
Delete an Index
It is a good idea to remove an index from the database if the index is not providing any benefit. For instance, if we know the queries in an application are no longer searching for records on a particular column, we can remove the index. Unneeded indexes only take up storage space and diminish the performance of modifications. You can remove most indexes with the Delete button on the index dialog box, which we saw earlier. The equivalent SQL command is shown below.
DROP INDEX Products.IX_Price
Again, we need to use the name of the table and the name of the index, with a dot separator. Some indexes are not so easy to drop, namely any index supporting a unique or primary key constraint. For example, the following command tries to drop the PK_Products index of the Products table.
DROP INDEX Products.PK_Products
Since the database uses PK_Products to enforce a primary key constraint on the Products table, the above command should produce the following error.
An explicit DROP INDEX is not allowed on index 'Products.PK_Products'.
It is being used for PRIMARY KEY constraint enforcement.
Removing a primary key constraint from a table is a redesign of the table and requires careful thought. The only ways to achieve this are to drop the table and recreate it with a CREATE TABLE command, or to remove the constraint with an ALTER TABLE command.
Conclusion
In this article we learned how to create, manage, and select indexes for SQL Server tables. Most of what we covered is true for any relational database engine. Proper indexes are crucial for good performance in large databases. Sometimes you can make up for a poorly written query with a good index, but it can be hard to make up for poor indexing with even the best queries.
View
A view is a virtual table that consists of columns from one or more tables. Though it is similar to a table, a view does not store data itself; what is stored in the database is a query saved as an object. Hence, a view is an object that derives its data from one or more tables. These tables are referred to as base or underlying tables.
Once you have defined a view, you can reference it like any other table in a database.
A view serves as a security mechanism. This ensures that users are able to retrieve and modify only the data seen by them. Users cannot see or access the remaining data in the underlying tables. A view also serves as a mechanism to simplify query execution. Complex queries can be stored in the form of a view, and data from the view can be extracted using simple queries.
Example
Consider the Publishers table below. If you want users to see only two columns in the table, you can create a view called vwPublishers that will refer to the Publishers table and the two columns required. You can grant permissions to users to use the view and revoke permissions from the base Publishers table. This way, users will be able to view only the two columns referred to by the view. They will not be able to query the Publishers table.
Publishers
PubId | PubName | City | State | Country |
0736 | New Moon Books | Boston | MA | USA |
0877 | Binnet & Hardley | Washington | DC | USA |
1389 | Algodata Infosystems | Berkeley | CA | USA |
1622 | Five Lakes Publishing | Chicago | IL | USA |
vwPublishers
PubId | PubName |
0736 | New Moon Books |
0877 | Binnet & Hardley |
1389 | Algodata Infosystems |
1622 | Five Lakes Publishing |
Views ensure the security of data by restricting access to the following data:
· Specific rows of the tables.
· Specific columns of the tables.
· Specific rows and columns of the tables.
· Rows fetched by using joins.
· Statistical summary of data in a given table.
· Subsets of another view or a subset of views and tables.
Some common examples of views are:
· A subset of rows or columns of a base table.
· A union of two or more tables.
· A join of two or more tables.
· A statistical summary of base tables.
· A subset of another view, or some combination of views and base table.
Creating Views
A view can be created by using the CREATE VIEW statement.
Syntax
CREATE VIEW view_name
[(column_name[, column_name...])]
[WITH ENCRYPTION]
AS select_statement [WITH CHECK OPTION]
Where:
view_name specifies the name of the view and must follow the rules for identifiers.
column_name specifies the name of the column to be used in the view. If the column_name option is not specified, then the view is created with the same columns as specified in the select_statement.
WITH ENCRYPTION encrypts the text for the view in the syscomments table.
AS specifies the actions that will be performed by the view.
select_statement specifies the SELECT Statement that defines a view. The view may use the data contained in other views and tables.
WITH CHECK OPTION forces the data modification statements to fulfill the criteria given in the SELECT statement defining the view. It also ensures that the data is visible after the modifications are made permanent.
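To illustrate WITH CHECK OPTION, here is a small sketch using the Publishers table shown earlier (the view name is invented for this example). The view exposes only US publishers, and the check option rejects any modification through the view that would make a row invisible to it:
CREATE VIEW vwUSPublishers
AS
SELECT PubId, PubName, Country
FROM Publishers
WHERE Country = 'USA'
WITH CHECK OPTION
GO
-- Fails: the updated row would no longer satisfy the view's WHERE clause
UPDATE vwUSPublishers SET Country = 'UK' WHERE PubId = '0736'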
The restrictions imposed on views are as follows:
· A view can be created only in the current database.
· The name of a view must follow the rules for identifiers and must not be the same as that of the base table.
· A view can be created only if there is a SELECT permission on its base table.
· A SELECT INTO statement cannot be used in a view declaration statement.
· A trigger or an index cannot be defined on a view.
· The CREATE VIEW statement cannot be combined with other SQL statements in a single batch.
Example
CREATE VIEW vwCustomer
AS
SELECT CustomerId, CompanyName, Phone
FROM Customers
Creates a view called vwCustomer. Note that the view is a query stored as an object. The data is derived from the columns of the base table Customers.
You use the view by querying the view like a table.
SELECT * FROM vwCustomer
The output of the SELECT statement is:
CustomerId | CompanyName | Phone |
ALFKI | Alfreds Futterkiste | 030-0074321 |
ANTON | Antonio Moreno Taqueria | (5)555-3932 |
(91 rows affected)
Getting Information on a View
SQL Server stores information on the view in the following system tables:
· SYSOBJECTS — stores the name of the view.
· SYSCOLUMNS — stores the names of the columns defined in the view.
· SYSDEPENDS — stores information on the view dependencies.
· SYSCOMMENTS — stores the text of the view definition.
There are also certain system stored procedures that help retrieve information on views. The sp_help system stored procedure displays view-related information, while sp_helptext displays the view definition, provided the name of the view is given as its parameter.
Example
sp_helptext vwCustomer
Displays the definition of the vwCustomer view.
Note
If a view is created with the WITH ENCRYPTION option, its definition cannot be viewed with the sp_helptext system stored procedure.
Altering Views
You can modify a view without dropping it. This ensures that the permission on the view is also not lost. You can modify a view without affecting its dependent objects, such as triggers and stored procedures.
You modify a view using the ALTER VIEW statement.
Syntax
ALTER VIEW view_name [(column_name[, column_name...])]
[WITH ENCRYPTION]
AS select_statement
[WITH CHECK OPTION]
Where:
view_name is the view to be altered.
column_name specifies the name of the column to be used in the view. If the column_name option is not specified, then the view is created with the same columns as specified in the select_statement.
WITH ENCRYPTION encrypts the text for the view in the SYSCOMMENTS table.
AS specifies the action that will be performed by the view.
select_statement specifies the SELECT statement that defines a view. The view could use other views and tables.
WITH CHECK OPTION forces the data modification statements to follow the criteria given in the SELECT statement definition.
Example
ALTER VIEW vwCustomer
AS
SELECT CustomerId, CompanyName, Phone, Fax
FROM Customers
Alters the vwCustomer view to add the Fax column of the Customers table.
When you query on the view using
SELECT * FROM vwCustomer
You see the following output:
CustomerId | CompanyName | Phone | Fax |
ALFKI | Alfreds Futterkiste | 030-0074321 | 030-0076545 |
ANTON | Antonio Moreno Taqueria | (5)555-3932 | NULL |
AROUT | Around the Horn | (171)555-7788 | (171)555-6750 |
(91 rows affected)
Note
If you define a view with a SELECT * statement and then change the structure of the underlying tables by adding columns, the new columns do not appear in the view. When all columns are selected in a CREATE VIEW statement, the column list is interpreted only when you first create the view. To see the new columns in the view, you must alter the view.
Dropping Views
You can drop a view from a database by using the DROP VIEW statement. When a view is dropped, it has no effect on the underlying tables. Dropping a view removes its definition and all the permissions assigned to it. Furthermore, if you query any view that references a dropped view, you receive an error message. Note, however, that dropping a table that a view references does not drop the view automatically; you must drop the view explicitly.
Syntax
DROP VIEW view_name
Where:
view_name is the name of the view to be dropped.
You can drop multiple views with a single DROP VIEW statement. The names of the views that need to be dropped are separated by commas in the DROP VIEW statement.
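For example, to drop two views in one statement (using the views created earlier in this section):
DROP VIEW vwPublishers, vwUSPublishers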
Note
If a view is defined with the SELECT * statement and the base table is altered by adding a few columns, the new columns do not get included in the view. The asterisk (*) symbol is interpreted and expanded only when the view is created. In order to access the new columns via the view, it is necessary to drop the view and recreate it.
If the underlying object is dropped, the views based on the object become inactive and any attempt to query data from the view will result in an error message.
Renaming Views
You can rename a view without having to drop it. This ensures that the permissions on the view are not lost.
The guidelines for renaming a view are as follows:
· The view must be in the current database.
· The new name for the view must follow the rules for identifiers.
· A view can be renamed only by its owner.
· A view can also be renamed by the owner of the database.
· A view can be renamed by using the sp_rename system stored procedure.
Syntax
sp_rename old_viewname, new_viewname
Where:
old_viewname is the view that needs to be renamed.
new_viewname is the new name of the view.
Example
sp_rename vwCustomer, vwCustomerDetails
Renames vwCustomer to vwCustomerDetails.
Manipulating Data Using Views
You can use a view to modify data in only one of its base tables at a time, even though the view may be derived from multiple underlying tables. For example, a view vwNew that is derived from two tables, table1 and table2, can be used to modify either table1 or table2 in a single statement. A single data modification statement that affects both of the underlying tables is not permitted.
You cannot modify the following types of columns using a view:
· Columns that are based on computed values.
· Columns that are based on built-in functions, such as numeric and string functions.
· Columns that are based on row aggregate functions.
Consider a situation in which a table contains a few columns that have been defined as NOT NULL, but the view derived from the table does not contain any of these NOT NULL columns. During an INSERT operation through the view, there may be two situations:
· All the NOT NULL columns in the base table are defined with default values.
· Some of the NOT NULL columns in the base table are not defined with default values.
In the first case, the INSERT operation will be successful because the default values are supplied for the NOT NULL columns. In the second case, the INSERT operation will fail because default values are not supplied for the NOT NULL columns.
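A minimal sketch of the two cases (the table, view, and column names below are invented for illustration):
-- Case 1: the NOT NULL column has a default, so inserting through the view succeeds
CREATE TABLE DemoWithDefault (Id int NOT NULL DEFAULT 0, Name varchar(20) NULL)
GO
CREATE VIEW vwDemoNames AS SELECT Name FROM DemoWithDefault
GO
INSERT INTO vwDemoNames (Name) VALUES ('ok')      -- succeeds; the default supplies Id

-- Case 2: no default on the NOT NULL column, so the insert fails
CREATE TABLE DemoNoDefault (Id int NOT NULL, Name varchar(20) NULL)
GO
CREATE VIEW vwDemoNames2 AS SELECT Name FROM DemoNoDefault
GO
INSERT INTO vwDemoNames2 (Name) VALUES ('fails')  -- error: Id cannot be NULL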
Enter the Synonym
SQL Server 2005 introduces the concept of a synonym: a single-part name that can replace a two-, three-, or four-part name in many SQL statements. Using synonyms lets you cut down on typing (always a welcome advance for developers!) and can also provide an abstraction layer to protect you against changes in underlying objects. To understand how this works, let's start with the syntax for creating synonyms. Not surprisingly, this is the CREATE SYNONYM statement:
CREATE SYNONYM [schema_name.]synonym_name FOR object_name
Here object_name is the name of a SQL Server object (specified fully enough to identify the object) and synonym_name is the new name that you want to assign to it. If you don't specify a schema for the synonym, then SQL Server uses the current user's default schema. The object doesn't need to exist when you create the synonym, because synonyms are late bound: SQL Server only checks the base object when you actually use the synonym.
For example, in the AdventureWorks sample database you can create a synonym this way:
CREATE SYNONYM ProdCat
FOR AdventureWorks.Production.ProductCategory
Having done this, you can now proceed to use the synonym in SQL statements. For example:
SELECT * FROM ProdCat
When you're done with a synonym, you can get rid of it using the DROP SYNONYM statement, which has the syntax you expect:
DROP SYNONYM [schema_name.]synonym_name
A Few of the Details
Naturally, there are some limits to synonym use. For starters, you can't create a synonym for just anything. SQL Server 2005 contains more objects than ever before, but synonyms are limited to a core set of the most useful objects:
- Tables
- Views
- Stored procedures
- CLR stored procedures
- CLR functions
- Replication filter procedures
- Extended stored procedures
- SQL scalar, table-valued, and inline table-valued functions
There are also limits on where you can use synonyms. Generally speaking, you can use a synonym in these T-SQL statements:
SELECT
sub-selects
UPDATE
INSERT
DELETE
EXECUTE
If you think about it, that second list shouldn't be too surprising: it encompasses the most common statements where you'd use the objects from the first list. Note that you cannot reference a synonym in a DDL statement. If you want to use ALTER TABLE to change something about the ProductCategory table, for example, you need to work with the base table, not with the ProdCat synonym.
To create a synonym, you need CREATE SYNONYM permission. After a synonym has been created, it has its own associated GRANT, DENY, and REVOKE permissions.
Synonyms as an Abstraction Layer
Besides saving you typing, synonyms can also serve as an abstraction layer over unstable or frequently updated portions of your database. That's because the bindings between a synonym and its base object are only checked at runtime, not at design time.
To illustrate this, examine what happens if you redefine the ProdCat synonym:
DROP SYNONYM ProdCat
CREATE SYNONYM ProdCat
FOR AdventureWorks.Production.ProductCategory
SELECT * FROM ProdCat
DROP SYNONYM ProdCat
CREATE SYNONYM ProdCat
FOR AdventureWorks.Production.Culture
SELECT * FROM ProdCat
Note that you need to use DROP and CREATE whenever you want to change a synonym; there's no ALTER SYNONYM statement. Running this batch of statements returns the ProductCategory rows from the first SELECT and the Culture rows from the second.
User-Defined Functions
With SQL Server 2000, Microsoft introduced the concept of User-Defined Functions, which allow you to define your own T-SQL functions that can accept 0 or more parameters and return a single scalar data value or a table data type.
What Kind of User-Defined Functions can I Create?
There are three types of User-Defined functions in SQL Server 2000 and they are Scalar, Inline Table-Valued and Multi-statement Table-valued.
How do I create and use a Scalar User-Defined Function?
A Scalar user-defined function returns one of the scalar data types. Text, ntext, image and timestamp data types are not supported. These are the type of user-defined functions that most developers are used to in other programming languages. You pass in 0 to many parameters and you get a return value. Below is an example based on the data found in the Northwind Customers table.
CREATE FUNCTION whichContinent
(@Country nvarchar(15))
RETURNS varchar(30)
AS
BEGIN
declare @Return varchar(30)
select @return = case @Country
when 'Argentina' then 'South America'
when 'Belgium' then 'Europe'
when 'Brazil' then 'South America'
when 'Canada' then 'North America'
when 'Denmark' then 'Europe'
when 'Finland' then 'Europe'
when 'France' then 'Europe'
else 'Unknown'
end
return @return
end
Because this function returns a scalar value of a varchar(30), it can be used anywhere a varchar(30) expression is allowed, such as a computed column in a table or view, or an item in a T-SQL select list. Below are some of the examples that I was able to use after creating the above function definition. Note that I had to reference the dbo in the function name.
print dbo.WhichContinent('USA')
select dbo.WhichContinent(Customers.Country), customers.* from customers
create table test
(Country varchar(15),
Continent as (dbo.WhichContinent(Country)))
insert into test (country) values ('USA')
select * from test
Country | Continent |
--------------- | ------------------------------ |
USA | North America |
Stored procedures have long given us the ability to pass parameters and get a value back, but the ability to use a function in such a variety of places where you cannot use a stored procedure makes this a very powerful database object. Also notice the logic of my function is not exactly brain surgery. But it does encapsulate the business rules for the different continents in one location in my application. If you were to build this logic into T-SQL statements scattered throughout your application and you suddenly noticed that you forgot a country (like I missed Austria!), you would have to make the change in every T-SQL statement where you had used that logic. Now, with the SQL Server User-Defined Function, you can quickly maintain this logic in just one place.
How do I create and use an Inline Table-Value User-Defined Function?
An Inline Table-Value user-defined function returns a table data type and is an exceptional alternative to a view as the user-defined function can pass parameters into a T-SQL select command and in essence provide us with a parameterized, non-updateable view of the underlying tables.
CREATE FUNCTION CustomersByContinent
(@Continent varchar(30))
RETURNS TABLE
AS
RETURN SELECT dbo.WhichContinent(Customers.Country) as continent,
customers.*
from customers
where dbo.WhichContinent(Customers.Country) = @Continent
GO
select * from CustomersbyContinent('North America')
select * from CustomersByContinent('South America')
select * from customersbyContinent('Unknown')
Note that the example uses another function (WhichContinent) to select out the customers specified by the parameter of this function. After creating the user-defined function, I can use it in the FROM clause of a T-SQL command unlike the behavior found when using a stored procedure which can also return record sets. Also note that I do not have to reference the dbo in my reference to this function. However, when using SQL Server built-in functions that return a table, you must now add the prefix :: to the name of the function.
Example from Books Online: Select * from ::fn_helpcollations()
How do I create and use a Multi-statement Table-Value User-Defined Function?
A Multi-Statement Table-Value user-defined function returns a table and is also an exceptional alternative to a view as the function can support multiple T-SQL statements to build the final result where the view is limited to a single SELECT statement. Also, the ability to pass parameters into a T-SQL select command or a group of them gives us the capability to in essence create a parameterized, non-updateable view of the data in the underlying tables. Within the create function command you must define the table structure that is being returned. After creating this type of user-defined function, I can use it in the FROM clause of a T-SQL command unlike the behavior found when using a stored procedure which can also return record sets.
CREATE FUNCTION dbo.customersbycountry (@Country varchar(15))
RETURNS @CustomersbyCountryTab table
([CustomerID] [nchar] (5), [CompanyName] [nvarchar] (40), [ContactName] [nvarchar] (30),
[ContactTitle] [nvarchar] (30), [Address] [nvarchar] (60), [City] [nvarchar] (15),
[PostalCode] [nvarchar] (10), [Country] [nvarchar] (15), [Phone] [nvarchar] (24),
[Fax] [nvarchar] (24))
AS
BEGIN
insert into @CustomersByCountryTab
SELECT [CustomerID],
[CompanyName],
[ContactName],
[ContactTitle],
[Address],
[City],
[PostalCode],
[Country],
[Phone],
[Fax]
FROM [Northwind].[dbo].[Customers]
where country = @Country
declare @cnt int
select @cnt = count(*) from @customersbyCountryTab
if @cnt = 0
insert into @CustomersByCountryTab
([CustomerID],
[CompanyName],
[ContactName],
[ContactTitle],
[Address],
[City],
[PostalCode],
[Country],
[Phone],
[Fax])
values
('','No Companies Found','','','','','','','','')
return
end
SELECT * FROM dbo.customersbycountry('USA')
SELECT * FROM dbo.customersbycountry('CANADA')
SELECT * FROM dbo.customersbycountry('ADF')
Advantages of User Defined Functions
Before SQL 2000, User Defined Functions (UDFs) were not available. Stored Procedures were often used in their place. When advantages or disadvantages of User Defined Functions are discussed, the comparison is usually to Stored Procedures.
One of the advantages of User Defined Functions over Stored Procedures is the fact that a UDF can be used in a SELECT, WHERE, or CASE statement. They can also be used to create joins. In addition, User Defined Functions are simpler to invoke than Stored Procedures from inside another SQL statement.
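For instance, using the whichContinent function defined above, a UDF can sit directly in a WHERE clause, something a stored procedure cannot do:
-- Find all European customers without repeating the continent logic
SELECT CompanyName, Country
FROM Customers
WHERE dbo.WhichContinent(Country) = 'Europe'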
Disadvantages of User Defined Functions
User Defined Functions cannot be used to modify base table information. The DML statements INSERT, UPDATE, and DELETE cannot be used on base tables. Another disadvantage is that SQL functions that return non-deterministic values are not allowed to be called from inside User Defined Functions. GETDATE is an example of a non-deterministic function. Every time the function is called, a different value is returned. Therefore, GETDATE cannot be called from inside a UDF you create.
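A common workaround, sketched below with an invented function name, is to evaluate GETDATE in the calling statement and pass the result in as a parameter:
CREATE FUNCTION dbo.OrderAgeInDays (@orderDate datetime, @asOf datetime)
RETURNS int
AS
BEGIN
    -- @asOf carries the current date, since GETDATE() cannot be called in here
    RETURN DATEDIFF(day, @orderDate, @asOf)
END
GO
-- The caller evaluates GETDATE() and passes the value in
SELECT OrderID, dbo.OrderAgeInDays(OrderDate, GETDATE()) AS AgeInDays
FROM Orders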
Transactions
Transactions are an inherent part of any application that collects or manipulates data. SQL Server has to ensure data integrity: no two users should modify the same piece of data at the same time, nor should they read "dirty" data (modified but uncommitted rows). This first article in this series on transactions and locking will explain the general terms of transactions and locking in Microsoft SQL Server. The rest of the articles will go into the details of managing transactions, and of locking and resolving blocking problems.
As it applies to SQL Server, a transaction is a single statement or multiple data manipulation language (DML) statements executed together. All statements in a transaction are treated as a single unit of work: either all of them succeed or all of them fail. SQL Server writes the changes to the log file first; if all statements succeed, the transaction is then committed, and the net result of the changes is saved to the data file. If any of the statements fail, then the whole transaction is rolled back, and none of the changes are saved. For example, consider what happens when Ms. Jones purchases a $1000 diamond ring. The application needs to record the purchase of the ring, adjust Ms. Jones's outstanding balance, and reduce the inventory count. The following code snippet shows what the system would have to do when such a purchase takes place:
BEGIN TRANSACTION
INSERT purchase (
item,
customer,
price,
item_count,
total)
VALUES (
'ring',
'ms. jones',
1000,
1,
1000)
IF @@ERROR <> 0
BEGIN
RAISERROR('error occurred while recording purchase', 16, 1)
ROLLBACK
RETURN -- stop processing; the transaction has been rolled back
END
UPDATE ring_inventory
SET current_count = current_count - 1
WHERE price = 1000
IF @@ERROR <> 0
BEGIN
RAISERROR('error occurred while adjusting inventory', 16, 1)
ROLLBACK
RETURN -- stop processing; the transaction has been rolled back
END
UPDATE balance
SET balance = balance + 1000
WHERE customer = 'ms. jones'
IF @@ERROR <> 0
BEGIN
RAISERROR('error occurred while adjusting balance', 16, 1)
ROLLBACK
RETURN -- stop processing; the transaction has been rolled back
END
COMMIT TRANSACTION
Two-tiered applications in which the business logic is enforced within the database make heavy use of database transactions. In multi-tiered applications, most of the business logic is encapsulated in middle-tier objects. Even so, multi-tiered applications also benefit greatly from transactions—database transaction simply becomes a part of a larger transaction typically initiated by the middle-tier component.
The way transactions are implemented can greatly affect the application's performance. The reason for this is the way SQL Server handles transactions. When a unit of data is about to be modified, SQL Server reserves that unit for exclusive use of the connection that initiated the transaction. In SQL Server, this is referred to as locking. Due to locking, no two users can modify the same piece of data at the same time. Nor can a user read data that is being modified by another user at the same time.
Perhaps we should step back and think about the reasons why locking is so important. Going back to the example above, consider what happens if Ms. Smith decides to purchase exactly the same ring as Ms. Jones. At 6 p.m. on March 2nd, the inventory table shows that the store has 20 rings available in stock, all worth $1000. Ms. Jones hands her charge card to the clerk. The clerk looks up the inventory, which shows 20 rings; in the meantime, Ms. Smith also hands her charge card to another clerk, who also sees 20 rings in the inventory. Both clerks decrement the inventory by one, but because each started from a count of 20, the table now shows 19 rings even though only 18 actually remain. What if Ms. White had purchased the same ring a week ago, and brought it back just as the other two ladies were buying the rings? Then the inventory might show 18, 19, 20, or 21, depending on which clerk was first to read data from the inventory table. You get the picture: without locking, our applications would run into numerous problems.
SQL Server supports various levels of lock granularity. The lowest level is a single row. Sometimes, SQL Server doesn't have enough resources to lock each individual row. In such cases, SQL Server can acquire locks on a single data or index page, group of pages, or an entire table. The granularity of locks depends on the memory available to SQL Server. We'll discuss the locking in greater detail in a following article.
By now, you should guess that transactions and locking represent a tradeoff between data integrity and concurrency. If you constantly lock data pages, the rest of the users will have to wait until your data modifications complete. On the other hand, if you don't acquire appropriate locks, the data integrity might be compromised. Understanding SQL Server transactions and locking behavior will get you well on your way to writing highly scalable applications.
Transaction Properties
Each transaction has to abide by the ACID properties, defined in the following table:
ACID Property | Meaning |
Atomicity | Either all or no work is performed. |
Consistency | A transaction must leave data in a consistent state. |
Isolation | Each transaction is independent of all other transactions. That means each transaction will read data that was committed prior to the beginning of other transactions or after the end of those transactions. |
Durability | After a transaction is committed, the data is in a persistent state, regardless of the circumstances. SQL Server records the transaction in the transaction log and marks it as committed. If the transaction is not committed, then SQL Server will roll all data changes back. |
SAVE TRAN and Save Points
Savepoints offer a mechanism to roll back portions of transactions. A user can set a savepoint, or marker, within a transaction. The savepoint defines a location to which a transaction can return if part of the transaction is conditionally canceled. SQL Server allows you to use savepoints via the SAVE TRAN statement, which doesn't affect the @@TRANCOUNT value. A rollback to a savepoint (not a transaction) doesn't affect the value returned by @@TRANCOUNT, either. However, the rollback must explicitly name the savepoint: using ROLLBACK TRAN without a specific name will always roll back the entire transaction.
The following script demonstrates how savepoints can be used:
USE pubs
SELECT 'Before BEGIN TRAN main', @@TRANCOUNT
-- The value of @@TRANCOUNT is 0
BEGIN TRAN main
SELECT 'After BEGIN TRAN main', @@TRANCOUNT
-- The value of @@TRANCOUNT is 1
DELETE sales
SAVE TRAN sales -- Mark a save point
SELECT 'After SAVE TRAN sales', @@TRANCOUNT
-- The value of @@TRANCOUNT is still 1
BEGIN TRAN nested
SELECT 'After BEGIN TRAN nested', @@TRANCOUNT
-- The value of @@TRANCOUNT is 2
DELETE titleauthor
SAVE TRAN titleauthor -- Mark a save point
SELECT 'After SAVE TRAN titleauthor', @@TRANCOUNT
-- The value of @@TRANCOUNT is still 2
ROLLBACK TRAN sales
SELECT 'After ROLLBACK TRAN sales', @@TRANCOUNT
-- The value of @@TRANCOUNT is still 2
SELECT TOP 5 au_id FROM titleauthor
IF (@@TRANCOUNT > 0) BEGIN
ROLLBACK TRAN
SELECT 'AFTER ROLLBACK TRAN', @@TRANCOUNT
-- The value of @@TRANCOUNT is 0 because
-- ROLLBACK TRAN always rolls back all transactions and sets @@TRANCOUNT
-- to 0.
END
SELECT TOP 5 au_id FROM titleauthor
Error Handling
The examples presented here are specific to stored procedures as they are the desired method of interacting with a database. When an error is encountered within a stored procedure, the best you can do is halt the sequential processing of the code and either branch to another code segment in the procedure or return processing to the calling application. The @@ERROR automatic variable is used to implement error handling code. It contains the error ID produced by the last SQL statement executed during a client's connection. When a statement executes successfully, @@ERROR contains 0. To determine if a statement executes successfully, an IF statement is used to check the value of @@ERROR immediately after the target statement executes. It is imperative that @@ERROR be checked immediately after the target statement, because its value is reset to 0 when the next statement executes successfully. If a trappable error occurs, @@ERROR will have a value greater than 0. SQL Server resets the @@ERROR value after every successful command, so you must immediately capture the @@ERROR value. Most of the time, you'll want to test for changes in @@ERROR right after any INSERT, UPDATE, or DELETE statement.
CREATE PROCEDURE addTitle(@title_id VARCHAR(6), @au_id VARCHAR(11),
@title VARCHAR(20), @title_type CHAR(12))
AS
BEGIN TRAN
INSERT titles(title_id, title, type)
VALUES (@title_id, @title, @title_type)
IF (@@ERROR <> 0) BEGIN
PRINT 'Unexpected error occurred!'
ROLLBACK TRAN
RETURN 1
END
INSERT titleauthor(au_id, title_id)
VALUES (@au_id, @title_id)
IF (@@ERROR <> 0) BEGIN
PRINT 'Unexpected error occurred!'
ROLLBACK TRAN
RETURN 1
END
COMMIT TRAN
RETURN 0
This kind of solution contains substantial repetition, especially if your business logic requires more than two Transact-SQL statements. A more elegant solution is to group the error handling code into a single segment and branch to it with GOTO:
CREATE PROCEDURE addTitle(@title_id VARCHAR(6), @au_id VARCHAR(11),
@title VARCHAR(20), @title_type CHAR(12))
AS
BEGIN TRAN
INSERT titles(title_id, title, type)
VALUES (@title_id, @title, @title_type)
IF (@@ERROR <> 0) GOTO ERR_HANDLER
INSERT titleauthor(au_id, title_id)
VALUES (@au_id, @title_id)
IF (@@ERROR <> 0) GOTO ERR_HANDLER
COMMIT TRAN
RETURN 0
ERR_HANDLER:
PRINT 'Unexpected error occurred!'
ROLLBACK TRAN
RETURN 1
Stored Procedures in SQL Server
A Transact-SQL stored procedure is a set of T-SQL code that is stored in a SQL Server database and compiled when used. You create this set of code using the CREATE PROCEDURE command. You can use most Transact-SQL commands in a stored procedure; however, some commands (such as CREATE PROCEDURE, CREATE VIEW, SET SHOWPLAN_TEXT, SET SHOWPLAN_ALL, and so forth) must be the first (or only) statement in a command batch, and therefore aren't allowed in stored procedures. Most Transact-SQL commands behave the same in a stored procedure as they do in a command batch, but some have special capabilities or exhibit different behavior when executed within the context of a stored procedure. Listing 1–1 shows a simple stored procedure (only the code from the CREATE PROCEDURE line down to the ensuing GO actually constitutes the stored procedure):
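Listing 1-1 itself is not reproduced in this excerpt. Based on the Profiler trace output shown later (which reveals the procedure name dbo.ListCustomersByCity and the Northwind database), a minimal sketch of such a procedure might look like the following; the parameter and the column list are my assumptions:
USE Northwind
GO
-- Drop any existing copy of the procedure before recreating it
IF OBJECT_ID('dbo.ListCustomersByCity') IS NOT NULL
    DROP PROCEDURE dbo.ListCustomersByCity
GO
CREATE PROCEDURE dbo.ListCustomersByCity
    @City nvarchar(15) = '%'    -- assumed parameter; defaults to all cities
AS
SELECT CustomerID, CompanyName, City
FROM Customers
WHERE City LIKE @City
GO
EXEC dbo.ListCustomersByCity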
Microsoft SQL Server provides the stored procedure mechanism to simplify the database development process by grouping Transact-SQL statements into manageable blocks.
Benefits
- Precompiled execution. SQL Server compiles each stored procedure once and then reutilizes the execution plan. This results in tremendous performance boosts when stored procedures are called repeatedly.
- Reduced client/server traffic. If network bandwidth is a concern in your environment, you'll be happy to learn that stored procedures can reduce long SQL queries to a single line that is transmitted over the wire.
- Efficient reuse of code and programming abstraction. Stored procedures can be used by multiple users and client programs. If you utilize them in a planned manner, you'll find the development cycle takes less time.
- Enhanced security controls. You can grant users permission to execute a stored procedure independently of underlying table permissions.
Stored Procedure Advantages
Although you can do most of the things a stored procedure can do with simple ad hoc Transact-SQL code, stored procedures have a number of advantages over ad hoc queries, including
- Execution plan retention and reuse
- Query autoparameterization
- Encapsulation of business rules and policies
- Application modularization
- Sharing of application logic between applications
- Access to database objects that is both secure and uniform
- Consistent, safe data modification
- Network bandwidth conservation
- Support for automatic execution at system start-up
I'll touch on each of these as we go along.
Structure
Stored procedures are extremely similar to the constructs seen in other programming languages. They accept data in the form of input parameters that are specified at execution time. These input parameters (if implemented) are utilized in the execution of a series of statements that produce some result. This result is returned to the calling environment through the use of a recordset, output parameters and a return code. That may sound like a mouthful, but you'll find that stored procedures are actually quite simple. Let's take a look at a practical example.
Example
Assume we have the following table named Inventory:
ID | Product | Warehouse | Quantity |
142 | Green Beans | NY | 100 |
214 | Peas | FL | 200 |
825 | Corn | NY | 140 |
512 | Lima Beans | NY | 180 |
491 | Corn | FL | 80 |
379 | Watermelon | FL | 85 |
This information is updated in real-time and warehouse managers are constantly checking the levels of products stored at their warehouse and available for shipment. In the past, each manager would run queries similar to the following:
SELECT Product, Quantity
FROM Inventory
WHERE Warehouse = 'FL'
This resulted in very inefficient performance at the SQL Server. Each time a warehouse manager executed the query, the database server was forced to recompile the query and execute it from scratch. It also required the warehouse manager to have knowledge of SQL and appropriate permissions to access the table information.
We can simplify this process through the use of a stored procedure. Let's create a procedure called sp_GetInventory that retrieves the inventory levels for a given warehouse. Here's the SQL code:
CREATE PROCEDURE sp_GetInventory
@location varchar(10)
AS
SELECT Product, Quantity
FROM Inventory
WHERE Warehouse = @location
Our Florida warehouse manager can then access inventory levels by issuing the command
EXECUTE sp_GetInventory 'FL'
The New York warehouse manager can use the same stored procedure to access that area's inventory.
EXECUTE sp_GetInventory 'NY'
Granted, this is a simple example, but the benefits of abstraction can be seen here. The warehouse manager does not need to understand SQL or the inner workings of the procedure. From a performance perspective, the stored procedure will work wonders. SQL Server creates an execution plan once and then reutilizes it by plugging in the appropriate parameters at execution time.
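The sp_GetInventory example uses only an input parameter. As a hedged sketch (the procedure and variable names below are invented), here is how output parameters and a return code, the other two output channels described earlier, fit into the same structure:
CREATE PROCEDURE sp_GetProductCount
    @location varchar(10),
    @productCount int OUTPUT
AS
SELECT @productCount = COUNT(*)
FROM Inventory
WHERE Warehouse = @location
IF @productCount = 0
    RETURN 1    -- return code signals an empty warehouse
RETURN 0
GO
-- The caller captures both the return code and the output parameter
DECLARE @count int, @rc int
EXEC @rc = sp_GetProductCount 'FL', @count OUTPUT
SELECT @rc AS ReturnCode, @count AS ProductCount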
Now that you've learned the benefits of stored procedures, get out there and use them! Try a few examples and measure the performance enhancements achieved -- you'll be amazed!
Execution Plan Compilation and Execution
When you execute a stored procedure for the first time, it's compiled into an execution plan. This plan is not compiled into machine code or even byte codes, but is pseudo-compiled in order to speed execution. By "pseudo-compiled" I mean that object references are resolved, join strategies and indexing selections are made, and an efficient plan for executing the work that the procedure is to carry out is rendered by the SQL Server query optimizer. The optimizer compares a number of potential plans for performing the procedure's work and selects the one it thinks will cost the least in terms of total execution time. It bases this decision on a number of factors, including the estimated I/O cost associated with each plan, the CPU cost, the memory requirements, and so on.
Once an execution plan has been created, it's stored in the procedure cache for future execution. This cache grows and contracts as necessary to store execution plans for the stored procedures and ad hoc queries executed by the server. SQL Server balances the need to supply adequate memory to the procedure cache with the server's other memory requirements, such as providing adequate resources for the data cache. Obviously, memory taken up by cached execution plans can't be used to cache data, so the server manages this carefully. Caching execution plans in memory saves the optimizer from having to construct a new plan each time a procedure is executed, and can improve performance dramatically.
Monitoring Execution
You can inspect the manner in which SQL Server compiles, stores, and runs execution plans using SQL Server's Profiler utility. To observe what happens when you create and run a procedure, follow these steps:
1. Start the Query Analyzer utility, connect to your server, and load the stored procedure script from Listing 1–1 (you can find the complete script on the CD accompanying this book).
2. Start the Profiler utility. You should find it in your Microsoft SQL Server Start|Programs folder.
3. Click the New Trace button and connect to your server.
4. On the Events page, remove every event class from the list on the right except the SQL:BatchStarting event class in the TSQL group.
5. Add every event class in the Stored Procedures group on the left except the SP:StmtStarting and SP:StmtComplete events. (A trace template file that includes these events, BasicTrace.TDF, is on the CD accompanying this book).
6. Click the Run button at the bottom of the Trace Properties dialog.
7. Return to Query Analyzer and run the script.
8. Return to Profiler and click the Stop Selected Trace button. You should see something like the following in the events window:
(Results abridged)
EventClass TextData
--------------------- -----------------------------------
SQL:BatchStarting Use Northwind
SQL:BatchStarting IF OBJECT_ID('dbo.ListCustomersByCi
SQL:BatchStarting CREATE PROCEDURE dbo.ListCustomersB
SQL:BatchStarting EXEC dbo.ListCustomersByCity
SP:CacheMiss
SP:CacheMiss
SP:CacheInsert
SP:Starting EXEC dbo.ListCustomersByCity
SP:Completed EXEC dbo.ListCustomersByCity
The trace output begins with four separate T-SQL command batches. Because the commands are separated by the GO batch terminator, each executes as a separate T-SQL batch. The last batch is the call to the stored procedure via the EXEC command. This call is responsible for the events that follow.
Note the SP:CacheInsert event immediately before the SP:Starting event. In conjunction with the SP:CacheMiss events, this tells us that ListCustomersByCity wasn't in the procedure cache when it was called, so an execution plan was compiled for it and inserted into the cache. The final two events in the trace, the SP:Starting and SP:Completed events, indicate that once the execution plan for the stored procedure was inserted into the cache, it was executed.
To see what happens when a procedure is executed directly from the cache, follow these steps:
1. Click the Start Selected Trace button to restart the trace.
2. Return to Query Analyzer, highlight the EXEC line in the query, and run it by itself.
3. Return to Profiler and stop the trace. You should see something like this:
(Results abridged)
EventClass TextData
-------------------- ----------------------------
SQL:BatchStarting EXEC dbo.ListCustomersByCity
SP:ExecContextHit
SP:Starting EXEC dbo.ListCustomersByCity
SP:Completed EXEC dbo.ListCustomersByCity
The ExecContextHit event tells us that an executable version of the stored procedure was found in the cache. Note the absence of the SP:CacheMiss and CacheInsert events. This tells us that the execution plan that was created and inserted into the cache when we ran the stored procedure the first time is reused when we run it a second time.
Execution Plans
When SQL Server runs an execution plan, each step of the plan is processed and dispatched to an appropriate internal manager process (e.g., the T-SQL manager, the DDL and DML managers, the transaction manager, the stored procedure manager, the utility manager, the ODSOLE manager, and so on). SQL Server calls these managers repeatedly until it has processed all the steps in the execution plan.
Execution plans are never stored permanently. The only portion of a stored procedure that is stored on disk is its source code (in syscomments). Because they're cached in memory, cycling the server disposes of all current execution plans (as does the DBCC FREEPROCCACHE() command).
SQL Server automatically recreates a stored procedure's execution plan when
· The procedure's execution environment differs significantly from its creation environment (see Environmental Issues later in the chapter for more information).
· The sysobjects schema_ver column changes for any of the objects the procedure references. The schema_ver and base_schema_ver columns are updated any time the schema information for a table changes; this includes column additions and deletions, data type changes, constraint additions and deletions, as well as rule and default bindings.
· The statistics have changed for any of the objects the procedure references. This means that the auto-update statistics and auto-create statistics events can cause stored procedure recompilation.
· An index is dropped that was referenced by the procedure's execution plan.
· A copy of the procedure's execution plan is not available in the cache. Execution plans are removed from the cache to make room for new plans using a Least Recently Used (LRU) algorithm.
· Certain other specialized circumstances occur, such as when a temporary table is modified a fixed number of times, when DDL and DML statements are interleaved, and when the sp_configure system procedure is called (sp_configure calls DBCC FREEPROCCACHE).
During the earlier discussion on creating procedures and SQL Server's limitation regarding having multiple CREATE TABLE statements for a temporary table in a single procedure, I mentioned that the ad hoc code approach (Listing 1–4) forces the procedure's execution plan to be recompiled while it's running. To see this for yourself, restart the trace we've been using and rerun the stored procedure from that query. You should see something like the following in Profiler:
EventClass TextData
--------------------- ----------------------------------------
SQL:BatchStarting exec testp 2
SQL:StmtStarting exec testp 2
SP:ExecContextHit
SP:Starting exec testp 2
SQL:StmtStarting -- testp CREATE TABLE #temp (k1 int identity)
SQL:StmtStarting -- testp IF @var=1
SQL:StmtStarting -- testp ALTER TABLE #temp ADD c1 varchar(2)
SP:Recompile
SP:CacheMiss
SP:CacheMiss
SP:CacheInsert
SQL:StmtStarting -- testp ALTER TABLE #temp ADD c1 varchar(2)
SQL:StmtStarting -- testp INSERT #temp DEFAULT VALUES
SP:Recompile
SP:CacheMiss
SP:CacheMiss
SP:CacheInsert
SQL:StmtStarting -- testp INSERT #temp DEFAULT VALUES
SQL:StmtStarting -- testp EXEC('SELECT c1 FROM #temp')
SQL:StmtStarting -- Dynamic SQL SELECT c1 FROM #temp
SP:Completed exec testp 2
Notice that not one, but two SP:Recompile events occur during the execution of the procedure: one when the ALTER TABLE is encountered (this statement refers to the temporary table created by the procedure, forcing a recompile) and another when the INSERT is encountered (this statement accesses the newly modified temporary table schema, again forcing a recompile). Assuming you've captured the SQL:StmtStarting or SP:StmtStarting event class in the trace, you'll typically see an SP:Recompile event enveloped in two identical StmtStarting events: The first one indicates that the statement began to be executed, but was put on hold so that the recompile could happen; the second indicates that the statement is actually executing now that the recompile has completed. This starting/stopping activity can have a serious impact on the time it takes the procedure to complete. It's worth pointing out again: Creating a temporary table within a procedure that you then process in other ways will cause the procedure's execution plan to be recompiled (one way to avoid temporary tables is to use local table variables instead). Moreover, interleaving DDL and DML within a procedure can also cause the plan to be recompiled. Because it can cause performance and concurrency problems, you want to avoid causing execution plan recompilation when you can.
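As a sketch of the table-variable alternative (testp2 is a hypothetical name; note that a table variable cannot be ALTERed after declaration, so its schema must be settled up front rather than chosen conditionally as testp does):
USE tempdb
GO
CREATE PROC dbo.testp2
AS
-- No CREATE TABLE #temp and no interleaved DDL/DML: nothing here
-- triggers the temp-table recompiles seen in the trace above
DECLARE @temp TABLE (k1 int IDENTITY, c1 varchar(2) NULL)
INSERT @temp DEFAULT VALUES
SELECT c1 FROM @temp
GO
EXEC dbo.testp2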
Another interesting fact that's revealed by the trace is that the execution plan for the dynamic T-SQL string the procedure creates and executes is not cached. Note that there's no CacheMiss, CacheInsert, CacheHit, or ExecContextHit event corresponding to the dynamic SQL query near the end of the trace log. Let's see what happens when we change the EXEC() call to use sp_executesql instead (Listing 1–13):
Listing 1–13 You can use sp_executesql rather than EXEC() to execute dynamic T-SQL.
USE tempdb
GO
IF OBJECT_ID('dbo.testp') IS NOT NULL
DROP PROC dbo.testp
GO
CREATE PROC testp @var int
AS
CREATE TABLE #temp (k1 int identity)
IF @var=1
ALTER TABLE #temp ADD c1 int
ELSE
ALTER TABLE #temp ADD c1 varchar(2)
INSERT #temp DEFAULT VALUES
EXEC dbo.sp_executesql N'SELECT c1 FROM #temp'
GO
exec testp 2
When you execute the procedure, you should see trace output like this:
EventClass TextData
------------------ -------------------------------------------------------
SQL:BatchStarting exec testp 2
SQL:StmtStarting exec testp 2
SP:CacheMiss
SP:CacheMiss
SP:CacheInsert
SP:Starting exec testp 2
SQL:StmtStarting -- testp CREATE TABLE #temp (k1 int identity)
SQL:StmtStarting -- testp IF @var=1
SQL:StmtStarting -- testp ALTER TABLE #temp ADD c1 varchar(2)
SP:Recompile
SP:CacheMiss
SP:CacheMiss
SP:CacheInsert
SQL:StmtStarting -- testp ALTER TABLE #temp ADD c1 varchar(2)
SQL:StmtStarting -- testp INSERT #temp DEFAULT VALUES
SP:Recompile
SP:CacheMiss
SP:CacheMiss
SP:CacheInsert
SQL:StmtStarting -- testp INSERT #temp DEFAULT VALUES
SQL:StmtStarting -- testp EXEC dbo.sp_executesql N'SELECT c1 FROM #temp'
SP:CacheMiss
SP:CacheMiss
SP:CacheInsert SELECT c1 FROM #temp
SQL:StmtStarting SELECT c1 FROM #temp
SP:Completed exec testp 2
Note the SP:CacheInsert event that occurs for the dynamic SELECT statement now that we are calling it via sp_executesql. This indicates that the execution plan for the SELECT statement has been inserted into the cache so that it can be reused later. Whether it actually will be reused is another matter, but at least the possibility exists that it can be. If you run the procedure a second time, you'll see that the call to sp_executesql itself generates an ExecContextHit event rather than the CacheMiss event it causes the first time around. By using sp_executesql, we've been able to use the procedure cache to make the procedure run more efficiently. The moral of the story is this: sp_executesql is generally a more efficient (and therefore faster) method of executing dynamic SQL than EXEC().
Forcing Plan Recompilation
You can also force a procedure's execution plan to be recompiled by
· Creating the procedure using the WITH RECOMPILE option
· Executing the procedure using the WITH RECOMPILE option
· Using the sp_recompile system procedure to "touch" any of the tables the procedure references (sp_recompile merely updates sysobjects' schema_ver column)
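In T-SQL, the three techniques look like this (dbo.myproc and Customers are placeholder names):
-- 1. Recompile on every execution (the plan is never cached)
CREATE PROC dbo.myproc
WITH RECOMPILE
AS
SELECT * FROM Customers
GO
-- 2. Recompile for this one call only
EXEC dbo.myproc WITH RECOMPILE
-- 3. Flag every plan that references a given table for recompilation
EXEC sp_recompile 'Customers'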
Once an execution plan is in the cache, subsequent calls to the procedure can reuse the plan without having to rebuild it. This eliminates the query tree construction and plan creation that normally occur when you execute a stored procedure for the first time, and is the chief performance advantage stored procedures have over ad hoc T-SQL batches.
Automatically Loading Execution Plans
A clever way of loading execution plans into the cache at system start-up is to execute them via an autostart procedure. Autostart procedures must reside in the master database, but they can call procedures that reside in other databases, forcing those procedures' plans into memory as well. If you're going to take this approach, creating a single autostart procedure that calls the procedures you want to load into the cache rather than autostarting each procedure individually will conserve execution threads (each autostart routine gets its own thread).
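A sketch of the approach (sp_warmcache is a hypothetical name; the procedures it calls stand in for whatever plans you want preloaded):
USE master
GO
CREATE PROC dbo.sp_warmcache
AS
-- Calling the procedures forces their plans into the cache
EXEC Northwind.dbo.ListCustomersByCity
EXEC pubs.dbo.reptq1
GO
-- Mark the procedure to run automatically at server start-up
EXEC sp_procoption 'sp_warmcache', 'startup', 'true'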
Environmental Issues
A number of SQL Server environmental settings affect the behavior of stored procedures. You specify most of these via SET commands. They control the way that stored procedures handle nulls, quotes, cursors, BLOB fields, and so forth. Two of these—QUOTED_IDENTIFIER and ANSI_NULLS—are stored permanently in each procedure's status field in sysobjects, as I mentioned earlier in the chapter. That is, when you create a stored procedure, the status of these two settings is stored along with it. QUOTED_IDENTIFIER controls whether strings within double quotes are interpreted as object identifiers (e.g., table or column references), and ANSI_NULLS controls whether non-ANSI equality comparisons with NULLs are allowed.
SET QUOTED_IDENTIFIER is normally used with a stored procedure to allow the procedure to reference objects with names that contain reserved words, spaces, or other illegal characters. An example is provided in Listing 1–20.
Listing 1–20 SET QUOTED_IDENTIFIER allows references to objects with names with embedded spaces.
USE Northwind
SET QUOTED_IDENTIFIER ON
GO
IF OBJECT_ID('dbo.listorders') IS NOT NULL
DROP PROC dbo.listorders
GO
CREATE PROC dbo.listorders
AS
SELECT * FROM "Order Details"
GO
SET QUOTED_IDENTIFIER OFF
GO
EXEC dbo.listorders
(Results abridged)
OrderID ProductID UnitPrice Quantity Discount
----------- ----------- --------------------- -------- -------------
10248 11 4.0000 12 0.0
10248 42 9.8000 10 0.0
10248 72 34.8000 5 0.0
10249 14 18.6000 9 0.0
10249 51 42.4000 40 0.0
10250 41 7.7000 10 0.0
"Order Details" contains both a reserved word and a space, so it can't be referenced without special handling. In this case, we turned on quoted identifier support and enclosed the table name in double quotes, but a better way would be to use SQL Server's square brackets ( [ ] ) to enclose the name (e.g., [Order Details]) because this alleviates the need to change any settings. Note that bracketed object names are not supported by the ANSI/ISO SQL standard.
The ANSI_NULLS setting is even more useful to stored procedures. It controls whether non-ANSI equality comparisons with NULLs work properly. This is particularly important with stored procedure parameters that can receive NULL values. See Listing 1–21 for an example:
Listing 1–21 SET ANSI_NULLS allows comparisons between variables or columns and NULL values to work as you would expect.
USE Northwind
IF (OBJECT_ID('dbo.ListRegionalEmployees') IS NOT NULL)
DROP PROC dbo.ListRegionalEmployees
GO
SET ANSI_NULLS OFF
GO
CREATE PROC dbo.ListRegionalEmployees @region nvarchar(30)
AS
SELECT EmployeeID, LastName, FirstName, Region FROM employees
WHERE Region=@region
GO
SET ANSI_NULLS ON
GO
EXEC dbo.ListRegionalEmployees NULL
(Results)
EmployeeID LastName FirstName Region
----------- -------------------- ---------- ---------------
5 Buchanan Steven NULL
6 Suyama Michael NULL
7 King Robert NULL
9 Dodsworth Anne NULL
Thanks to SET ANSI_NULLS, the procedure can successfully compare a NULL @region with the region column in the Northwind Employees table. The query returns the rows that have NULL region values because, contrary to the ANSI SQL specification, SQL Server checks the NULL variable against the column for equality. The handiness of this becomes more evident when a procedure defines a large number of "NULL-able" parameters. Without the ability to test NULL values for equality in a manner identical to non-NULL values, each NULL-able parameter would require special handling (perhaps using the IS NULL predicate), very likely multiplying the amount of code necessary to process query parameters.
Because SQL Server stores the QUOTED_IDENTIFIER and ANSI_NULLS settings with each stored procedure, you can trust them to have the values you require when a procedure runs. The server restores them to the values they had when the procedure was created each time the procedure runs, then resets them afterward. Here's an example:
SET ANSI_NULLS ON
EXEC dbo.ListRegionalEmployees NULL
The stored procedure still executes as though ANSI_NULLS is set to OFF. Note that you can check the saved status of a procedure's QUOTED_IDENTIFIER and ANSI_NULLS settings via the OBJECTPROPERTY() function. An example is provided in Listing 1–22:
Listing 1–22 You can check the ANSI_NULLS and QUOTED_IDENTIFIER status for a procedure using the OBJECTPROPERTY function.
USE Northwind
SELECT OBJECTPROPERTY(OBJECT_ID('dbo.ListRegionalEmployees'), 'ExecIsAnsiNullsOn')
AS 'AnsiNulls'
(Results)
AnsiNulls
-----------
0
A number of other environmental commands affect how stored procedures execute. SET XACT_ABORT, SET CURSOR_CLOSE_ON_COMMIT, SET TEXTSIZE, SET IMPLICIT_TRANSACTIONS, and numerous others help determine how a stored procedure behaves when executed. If you have a stored procedure that requires a SET command to have a particular value to run properly, set it to that value as early as possible in the procedure and document why it's necessary via comments.
Automatic Variables, a.k.a. System Functions
By their very nature, automatic variables, also known as system functions, are usually the province of stored procedures. This makes most of them germane in some way to a discussion about stored procedures. Several, in fact, are used almost exclusively in stored procedures. Table 1–1 summarizes them.
Table 1–1 Stored Procedure-Related Functions
Function | Returns
@@FETCH_STATUS | The status of the last FETCH operation
@@NESTLEVEL | The current procedure nesting level
@@OPTIONS | A bitmap of the currently specified user options
@@PROCID | The object ID of the current procedure
@@SPID | The process ID of the current process
@@TRANCOUNT | The current transaction nesting level
Errors
The @@ERROR automatic variable returns the error code of the last Transact-SQL statement. If there was no error, @@ERROR returns zero. Because @@ERROR is reset after each Transact-SQL statement, you must save it to a variable if you wish to process it further after checking it.
If you want to write robust code that runs for years without having to be reengineered, make a habit of checking @@ERROR often in your stored procedures, especially after data modification statements. A good indicator of resilient code is consistent error checking, and until Transact-SQL supports structured exception handling, checking @@ERROR frequently is the best way to protect your code against unforeseen circumstances.
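For example, here is a minimal sketch of the pattern inside a procedure body (the table and message text are illustrative):
DECLARE @err int
UPDATE Customers SET ContactName='Maria Anders' WHERE CustomerID='ALFKI'
SET @err=@@ERROR  -- save it immediately; the next statement resets @@ERROR
IF (@err<>0) BEGIN
RAISERROR('The update failed with error %d',16,1,@err)
RETURN(-1)
END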
Error Messages
The system procedure sp_addmessage adds custom messages to the sysmessages table that can then be raised (returned to the client) by the RAISERROR command. User messages must have error numbers greater than 50,000 (i.e., 50,001 or higher). The chief advantage of using SQL Server's system messages facility is internationalization. Because you specify a language ID when you add a message via sp_addmessage, you can add a separate version of your application's messages for each language it supports. When your stored procedures then reference a message by number, the appropriate message will be returned to your application using SQL Server's current language setting.
RAISERROR
Stored procedures report errors to client applications via the RAISERROR command. RAISERROR doesn't change the flow of a procedure; it merely displays an error message, sets the @@ERROR automatic variable, and optionally writes the message to the SQL Server error log and the NT application event log. RAISERROR can reference an error message added to the sysmessages table via the sp_addmessage system procedure, or you can supply it a message string of your own. If you pass a custom message string to RAISERROR, the error number is set to 50,000; if you raise an error by number using a message ID in the sysmessages table, @@ERROR is assigned the message number you raise. RAISERROR can format messages similarly to the C printf() function, allowing you to supply your own arguments for the error messages it returns.
Both a severity and a state can be specified when raising an error message with RAISERROR. Severity values less than 16 produce informational messages in the application event log (when logged). A severity of 16 produces a warning message in the event log. Severity values greater than 16 produce error messages in the event log. Severity values up through 18 can be raised by any user; severity values 19 through 25 are reserved for members of the sysadmin role and require the use of the WITH LOG option. Severity values of 20 and higher are considered fatal and cause the client connection to be terminated.
State has no predefined meaning to SQL Server; it's an informational value that you can use to return state information to an application. Raising an error with a state of 127 will cause the ISQL and OSQL utilities to set the operating system ERRORLEVEL variable to the error number returned by RAISERROR.
The WITH LOG option copies the error message to the NT event log (if SQL Server is running on Windows NT, Windows 2000, or Windows XP) and the SQL Server error log regardless of whether the message was defined using the with_log option of sp_addmessage. The WITH NOWAIT option causes the message to be returned immediately to the client. The WITH SETERROR option forces @@ERROR to return the last error number raised, regardless of the severity of the error message. See Chapter 7 for detailed examples of how to use RAISERROR(), @@ERROR, and SQL Server's other error-handling mechanisms.
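A short sketch tying these together (message number 50001 and its text are my own examples):
-- Register a custom message once per server
EXEC sp_addmessage @msgnum=50001, @severity=16,
@msgtext='Customer %s could not be found.', @lang='us_english'
GO
-- Raise it by number; 'ALFKI' fills the %s placeholder
RAISERROR(50001,16,1,'ALFKI')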
Nesting
You can nest stored procedure calls up to 32 levels deep. Use the @@NESTLEVEL automatic variable to check the nesting level from within a stored procedure or trigger. From a command batch, @@NESTLEVEL returns 0. From a stored procedure called from a command batch and from first-level triggers, @@NESTLEVEL returns 1. From a procedure or trigger called from nesting level 1, @@NESTLEVEL returns 2; procedures called from level 2 procedures return level 3, and so on. Objects (including temporary tables) and cursors created within a stored procedure are visible to all objects it calls. Objects and cursors created in a command batch are visible to all the objects referenced in the command batch.
Recursion
Because Transact-SQL supports recursion, you can write stored procedures that call themselves. Recursion can be defined as a method of problem solving wherein the solution is arrived at by repetitively applying it to subsets of the problem. A common application of recursive logic is to perform numeric computations that lend themselves to repetitive evaluation by the same processing steps. Listing 1–31 presents an example that features a stored procedure that calculates the factorial of a number:
Listing 1–31 Stored procedures can call themselves recursively.
SET NOCOUNT ON
USE master
IF OBJECT_ID('dbo.sp_calcfactorial') IS NOT NULL
DROP PROC dbo.sp_calcfactorial
GO
CREATE PROC dbo.sp_calcfactorial @base_number decimal(38,0), @factorial decimal(38,0) OUT
AS
SET NOCOUNT ON
DECLARE @previous_number decimal(38,0)
IF ((@base_number>26) and (@@MAX_PRECISION<38)) OR (@base_number>32) BEGIN
RAISERROR('Computing this factorial would exceed the server''s max. numeric precision of %d or the max. procedure nesting level of 32',16,10,@@MAX_PRECISION)
RETURN(-1)
RETURN(-1)
END
IF (@base_number<0) BEGIN
RAISERROR('Can''t calculate negative factorials',16,10)
RETURN(-1)
END
IF (@base_number<2) SET @factorial=1 -- Factorial of 0 or 1=1
ELSE BEGIN
SET @previous_number=@base_number-1
EXEC dbo.sp_calcfactorial @previous_number, @factorial OUT -- Recursive call
IF (@factorial=-1) RETURN(-1) -- Got an error, return
SET @factorial=@factorial*@base_number
IF (@@ERROR<>0) RETURN(-1) -- Got an error, return
END
RETURN(0)
GO
DECLARE @factorial decimal(38,0)
EXEC dbo.sp_calcfactorial 32, @factorial OUT
SELECT @factorial
The procedure begins by checking to make sure it has been passed a valid number for which to compute a factorial. It then recursively calls itself to perform the computation. With the default maximum numeric precision of 38, SQL Server can handle numbers in excess of 263 decillion. (Decillion is the U.S. term for 1 followed by 33 zeros. In Great Britain, France, and Germany, 1 followed by 33 zeros is referred to as 1,000 quintillion.) As you'll see in Chapter 11, UDFs are ideal for computations like factorials.
SQL Cursors
In this article, I want to show you how to create and use server-side cursors and how to optimize cursor performance.
A cursor is a database object used by applications to manipulate data in a set on a row-by-row basis, instead of the typical SQL commands that operate on all the rows in the set at one time. For example, you can use a cursor to walk through a list of all user databases and perform multiple operations against each database by passing each database name as a variable.
Server-side cursors were first added in the SQL Server 6.0 release and are now supported in all editions of SQL Server 7.0 and SQL Server 2000.
Before using a cursor, you must first declare it. Once a cursor has been declared, you can open it and fetch from it. You can fetch row by row and perform multiple operations on the currently active row in the cursor. When you have finished working with a cursor, you should close it and deallocate it to release SQL Server resources.
Declaring a Cursor
Before using a cursor, you must first declare it, i.e., define its scrolling behavior and the query used to build the result set on which the cursor operates. To declare a cursor, you can use either a syntax based on the SQL-92 standard or a syntax based on a set of Transact-SQL extensions.
SQL-92 Syntax
This is SQL-92 Syntax:
DECLARE cursor_name [INSENSITIVE] [SCROLL] CURSOR FOR select_statement [FOR {READ ONLY | UPDATE [OF column_name [,...n]]}]
where
cursor_name - the name of the server-side cursor; must contain from 1 to 128 characters.
INSENSITIVE - specifies that the cursor will use a temporary copy of the data instead of the base tables. This cursor does not allow modifications, and modifications made to the base tables are not reflected in the data returned by fetches made against this cursor.
SCROLL - specifies that the cursor can fetch data in all directions, not only sequentially toward the end of the result set. If this argument is not specified, FETCH NEXT is the only fetch option supported.
select_statement - a standard SELECT statement; cannot contain the COMPUTE, COMPUTE BY, FOR BROWSE, or INTO keywords.
READ ONLY - specifies that the cursor cannot be updated.
UPDATE [OF column_name [,...n]] - specifies that all of the cursor's columns can be updated (if OF column_name [,...n] is not specified), or that only the columns listed in the OF column_name [,...n] list allow modifications.
Cursor Options Compatibility
            | INSENSITIVE | SCROLL | READ ONLY | UPDATE
INSENSITIVE |      -      |  Yes   |    Yes    |   No
SCROLL      |     Yes     |   -    |    Yes    |   Yes
READ ONLY   |     Yes     |  Yes   |     -     |   No
UPDATE      |     No      |  Yes   |    No     |   -
Transact-SQL Extended Syntax
This is Transact-SQL Extended Syntax:
DECLARE cursor_name CURSOR [LOCAL | GLOBAL] [FORWARD_ONLY | SCROLL] [STATIC | KEYSET | DYNAMIC | FAST_FORWARD] [READ_ONLY | SCROLL_LOCKS | OPTIMISTIC] [TYPE_WARNING] FOR select_statement [FOR UPDATE [OF column_name [,...n]]]
where
cursor_name - the name of the server-side cursor; must contain from 1 to 128 characters.
LOCAL - specifies that the cursor is available only in the batch, stored procedure, or trigger in which it was created. A LOCAL cursor is implicitly deallocated when the batch, stored procedure, or trigger terminates.
GLOBAL - specifies that the cursor is global to the connection. A GLOBAL cursor is implicitly deallocated at disconnect.
FORWARD_ONLY - specifies that the cursor can only fetch data sequentially from the first to the last row. FETCH NEXT is the only fetch option supported.
STATIC - specifies that the cursor will use a temporary copy of the data instead of the base tables. This cursor does not allow modifications, and modifications made to the base tables are not reflected in the data returned by fetches made against this cursor.
KEYSET - specifies that the cursor uses the set of keys that uniquely identify the cursor's rows (the keyset), so that the membership and order of rows in the cursor are fixed when the cursor is opened. SQL Server uses a table in tempdb to store the keyset. The KEYSET cursor allows updates of nonkey values to be made through the cursor, but inserts made by other users are not visible. Updates of nonkey values made by other users are visible as the owner scrolls around the cursor, but updates of key values made by other users are not visible. If a row is deleted, an attempt to fetch the row returns an @@FETCH_STATUS of -2.
DYNAMIC - specifies that the cursor reflects all data changes made to the base tables as you scroll around the cursor. The FETCH ABSOLUTE option is not supported with a DYNAMIC cursor.
FAST_FORWARD - specifies that the cursor will be a FORWARD_ONLY, READ_ONLY cursor. FAST_FORWARD cursors produce the least amount of overhead on SQL Server.
READ_ONLY - specifies that the cursor cannot be updated.
SCROLL_LOCKS - specifies that the cursor will lock the rows as they are read into the cursor, to ensure that positioned updates or deletes made through the cursor will succeed.
OPTIMISTIC - specifies that the cursor does not lock rows as they are read into the cursor. Consequently, positioned updates or deletes made through the cursor will not succeed if the row has been updated outside the cursor since it was read into the cursor.
TYPE_WARNING - specifies that if the cursor is implicitly converted from the requested type to another, a warning message will be sent to the client.
select_statement - a standard SELECT statement; cannot contain the COMPUTE, COMPUTE BY, FOR BROWSE, or INTO keywords.
UPDATE [OF column_name [,...n]] - specifies that all of the cursor's columns can be updated (if OF column_name [,...n] is not specified), or that only the columns listed in the OF column_name [,...n] list allow modifications.
Cursor Options Compatibility
                  |  L  |  G  | FO  |  S  |  K  |  D  | FF  | RO  | SL  |  O  | TW  |  U
LOCAL (L)         |  -  | No  | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes
GLOBAL (G)        | No  |  -  | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes
FORWARD_ONLY (FO) | Yes | Yes |  -  | Yes | Yes | Yes | No  | Yes | Yes | Yes | Yes | Yes
STATIC (S)        | Yes | Yes | Yes |  -  | No  | No  | No  | Yes | No  | Yes | Yes | No
KEYSET (K)        | Yes | Yes | Yes | No  |  -  | No  | No  | Yes | Yes | Yes | Yes | Yes
DYNAMIC (D)       | Yes | Yes | Yes | No  | No  |  -  | No  | Yes | Yes | Yes | Yes | Yes
FAST_FORWARD (FF) | Yes | Yes | No  | No  | No  | No  |  -  | Yes | No  | No  | Yes | No
READ_ONLY (RO)    | Yes | Yes | Yes | Yes | Yes | Yes | Yes |  -  | No  | No  | Yes | No
SCROLL_LOCKS (SL) | Yes | Yes | Yes | No  | Yes | Yes | No  | No  |  -  | No  | Yes | Yes
OPTIMISTIC (O)    | Yes | Yes | Yes | Yes | Yes | Yes | No  | No  | No  |  -  | Yes | Yes
TYPE_WARNING (TW) | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |  -  | Yes
UPDATE (U)        | Yes | Yes | Yes | No  | Yes | Yes | No  | No  | Yes | Yes | Yes |  -
Opening a Cursor
Once a cursor has been declared, you must open it to fetch data from it. To open a cursor, you can use the following syntax:
OPEN { { [GLOBAL] cursor_name } | cursor_variable_name }
where
GLOBAL - if this argument was not specified and both a global and a local cursor exist with the same name, the local cursor will be opened; otherwise, the global cursor will be opened.
cursor_name - the name of the server side cursor, must contain from 1 to 128 characters.
cursor_variable_name - the name of a cursor variable that references a cursor.
After a cursor is opened, you can determine the number of rows it found by checking the @@CURSOR_ROWS function.
Fetching a Cursor
Once a cursor has been opened, you can fetch from it row by row and make multiple operations on the currently active row in the cursor. To fetch from a cursor, you can use the following syntax:
FETCH [ [ NEXT | PRIOR | FIRST | LAST | ABSOLUTE {n | @nvar} | RELATIVE {n | @nvar} ] FROM ] { { [GLOBAL] cursor_name } | @cursor_variable_name} [INTO @variable_name[,...n] ]
where
NEXT - the default cursor fetch option. FETCH NEXT returns the next row after the current row.
PRIOR - returns the prior row before the current row.
FIRST - returns the first row in the cursor.
LAST - returns the last row in the cursor.
ABSOLUTE {n | @nvar} - returns the nth row in the cursor. If a positive number is specified, the rows are counted from the top of the data set; if 0 is specified, no row is returned; if a negative number is specified, the rows are counted from the bottom of the data set.
RELATIVE {n | @nvar} - returns the nth row in the cursor relative to the current row. If a positive number is specified, returns the nth row beyond the current row; if a negative number is specified, returns the nth row prior to the current row; if 0 is specified, returns the current row.
GLOBAL - if this argument was not specified and both a global and a local cursor exist with the same name, the local cursor will be fetched; otherwise, the global cursor will be fetched.
cursor_name - the name of the server side cursor, must contain from 1 to 128 characters.
cursor_variable_name - the name of a cursor variable that references a cursor.
INTO @variable_name[,...n] - allows data returned from the cursor to be held in temporary variables. The type of variables must match the type of columns in the cursor select list or support implicit conversion. The number of variables must match the number of columns in the cursor select list.
Closing a Cursor
When you have finished working with a cursor, you can close it to release any resources and locks that SQL Server may have used while the cursor was open.
To close a cursor, you can use the following syntax:
CLOSE { { [GLOBAL] cursor_name } | cursor_variable_name }
where
GLOBAL - if this argument was not specified and both a global and a local cursor exist with the same name, the local cursor will be closed; otherwise, the global cursor will be closed.
cursor_name - the name of the server side cursor, must contain from 1 to 128 characters.
cursor_variable_name - the name of a cursor variable that references a cursor.
Note. If you have closed a cursor, but have not deallocated it, you can open it again when needed.
Deallocating a Cursor
When you have finished working with a cursor and want to completely release SQL Server resources that were used by a cursor, you can deallocate a cursor.
To deallocate a cursor, you can use the following syntax:
DEALLOCATE { { [GLOBAL] cursor_name } | @cursor_variable_name}
where
GLOBAL - if this argument was not specified and both a global and a local cursor exist with the same name, the local cursor will be deallocated; otherwise, the global cursor will be deallocated.
cursor_name - the name of the server side cursor, must contain from 1 to 128 characters.
cursor_variable_name - the name of a cursor variable that references a cursor.
Note. Deallocating a cursor completely removes all cursor references. So, after a cursor is deallocated, it no longer can be opened.
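Putting the pieces together, here is a minimal sketch of the full lifecycle (declare, open, fetch in a loop, close, deallocate) against the Northwind Customers table:
DECLARE @name nvarchar(40)
DECLARE customer_cur CURSOR LOCAL FAST_FORWARD FOR
SELECT CompanyName FROM Northwind.dbo.Customers
OPEN customer_cur
FETCH NEXT FROM customer_cur INTO @name
WHILE (@@FETCH_STATUS=0) BEGIN
PRINT @name  -- row-by-row processing goes here
FETCH NEXT FROM customer_cur INTO @name
END
CLOSE customer_cur
DEALLOCATE customer_cur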
Cursor Optimization Tips
- Try to avoid using SQL Server cursors whenever possible. SQL Server cursors can result in some performance degradation in comparison with set-based SELECT statements. Try to use a correlated subquery or derived table if you need to perform row-by-row operations.
- Do not forget to close a SQL Server cursor when its result set is no longer needed. To close a cursor, use the CLOSE {cursor_name} command. This command releases the cursor result set and frees any cursor locks held on the rows on which the cursor is positioned.
- Do not forget to deallocate a SQL Server cursor when the data structures comprising the cursor are no longer needed. To deallocate a cursor, use the DEALLOCATE {cursor_name} command. This command removes a cursor reference and releases the data structures comprising the cursor.
- Try to reduce the number of records processed in the cursor. To reduce the cursor result set, use a WHERE clause in the cursor's SELECT statement. This can increase cursor performance and reduce SQL Server overhead.
- Try to reduce the number of columns processed in the cursor. Include only the necessary columns in the cursor's SELECT statement. This reduces the cursor result set, so the cursor uses fewer resources, which can increase cursor performance and reduce SQL Server overhead.
- Use READ ONLY cursors, whenever possible, instead of updatable cursors. Because cursors can reduce concurrency and lead to unnecessary locking, use READ ONLY cursors if you do not need to update the cursor result set.
- Try to avoid using insensitive, static, and keyset cursors whenever possible. These cursor types produce the largest amount of overhead on SQL Server, because they cause a temporary table to be created in tempdb, which results in some performance degradation.
- Use FAST_FORWARD cursors whenever possible. FAST_FORWARD cursors produce the least amount of overhead on SQL Server, because they are read-only and can only be scrolled from the first to the last row. Use a FAST_FORWARD cursor if you do not need to update the cursor result set and FETCH NEXT will be the only fetch option used.
- Use FORWARD_ONLY cursors if you need an updatable cursor and FETCH NEXT will be the only fetch option used. If you need a read-only cursor and FETCH NEXT will be the only fetch option used, use a FAST_FORWARD cursor instead of a FORWARD_ONLY cursor. (Note that if one of FAST_FORWARD or FORWARD_ONLY is specified, the other cannot be.)
Candidate Keys
The first type of key you establish for a table is the candidate key, which is a field or set of fields that uniquely identifies a single instance of the table's subject. Each table must have at least one candidate key. You'll eventually examine the table's pool of available candidate keys and designate one of them as the official primary key for the table.
Before you can designate a field as a candidate key, you must make certain it complies with all of the Elements of a Candidate Key. These elements constitute a set of guidelines you can use to determine whether the field is fit to serve as a candidate key. You cannot designate a field as a candidate key if it fails to conform to any of these elements.
Artificial Candidate Keys
When you determine that a table does not contain a candidate key, you can create and use an artificial (or surrogate) candidate key. (It's artificial in the sense that it didn't occur "naturally" in the table; you have to manufacture it.) You establish an artificial candidate key by creating a new field that conforms to all of the Elements of a Candidate Key and then adding it to the table; this field becomes the official candidate key.
Primary Keys
By now, you've established all the candidate keys that seem appropriate for every table. Your next task is to establish a primary key for each table, which is the most important key of all.
- A primary key field exclusively identifies the table throughout the database structure and helps establish relationships with other tables. (You'll learn more about this in Chapter 10.)
- A primary key value uniquely identifies a given record within a table and exclusively represents that record throughout the entire database. It also helps to guard against duplicate records.
Alternate Keys
Now that you've selected a candidate key to serve as the primary key for a particular table, you'll designate the remaining candidate keys as alternate keys. These keys can be useful to you in an RDBMS program because they provide an alternative means of uniquely identifying a particular record within the table. If you choose to use an alternate key in this manner, mark its name with "AK" or "CAK" (composite alternate key) in the table structure; otherwise, remove its designation as an alternate key and simply return it to the status of a normal field. You won't be concerned with alternate keys for the remainder of the database-design process, but you will work with them once again as you implement the database in an RDBMS program. (Implementing and using alternate keys in RDBMS programs is beyond the scope of this work—our only objective here is to designate them as appropriate. This is in line with the focus of the book, which is the logical design of a database.)
How many SQL 2000 editions exist and what is the difference between them?
Answer:
SQL Server 2000 has the following editions:
Personal Edition
Standard Edition
Enterprise Edition
Developer Edition
Desktop Engine
SQL Server CE
Evaluation Edition
Personal Edition can work on Windows 98, Windows NT Server 4.0 with Service Pack 5 or later, Windows NT Workstation 4.0 with Service Pack 5 or later, and on all editions of Windows 2000. This edition is the successor to the SQL Server 7.0 Desktop Edition.
This edition has some restrictions:
maximum 2 CPU
no Distributed Partitioned Views
no Log Shipping
no Parallel DBCC
no Parallel index creation
no Failover clustering
no publishing for transaction replication
maximum 2Gb RAM
Standard Edition can work on the Windows NT Server 4.0 with Service Pack 5, Windows NT Server 4.0 Enterprise Edition and on the Windows 2000 Server/Advanced Server/DataCenter.
This edition has the following restrictions:
maximum 4 CPU (up to 8 CPU on the Windows NT Enterprise Edition)
no Distributed Partitioned Views
no Log Shipping
no Parallel index creation
no Failover clustering
maximum 2Gb RAM
Enterprise Edition can work on the Windows NT Server 4.0 with Service Pack 5, Windows NT Server 4.0 Enterprise Edition and on the Windows 2000 Server/Advanced Server/DataCenter.
This edition can use:
- up to 32 CPUs on Windows 2000 Datacenter, up to 8 CPUs on Windows 2000 Advanced Server and Windows NT Server 4.0 Enterprise Edition, and up to 4 CPUs on Windows NT Server 4.0 and Windows 2000 Server
- up to 64 GB RAM on Windows 2000 Datacenter, up to 8 GB RAM on Windows 2000 Advanced Server, up to 4 GB RAM on Windows 2000 Server, up to 3 GB RAM on Windows NT Server 4.0 Enterprise Edition, and up to 2 GB RAM on Windows NT Server 4.0
- Distributed Partitioned Views
- Log Shipping
- Parallel index creation
- Failover clustering
The Developer Edition can be used by developers to create and debug stored procedures and triggers. This edition comes with its own compact disc and can be upgraded to SQL Server 2000 Enterprise Edition.
The Desktop Engine has no graphical user interface and is the successor to MSDE, not to the SQL Server 7.0 Desktop Edition. The size of Desktop Engine databases cannot exceed 2 GB, and the Desktop Engine can use a maximum of 2 CPUs.
The SQL Server CE edition works only on Microsoft Windows CE, so it has all the restrictions of that operating system (it can use only 1 CPU, and there is no parallel index creation, no Full-Text Search, and so on).
The Evaluation Edition can be used only for test purposes, to learn more about the new features and enhancements, and should be uninstalled after the 120-day evaluation period.
FAQ
What is normalization? Explain different levels of normalization?
Check out the article Q100139 from the Microsoft Knowledge Base; of course, there's much more information available on the net. It's a good idea to get hold of an RDBMS fundamentals textbook, especially the one by C. J. Date. Most of the time, it will be okay if you can explain up to third normal form.
What is denormalization and when would you go for it?
As the name indicates, denormalization is the reverse process of normalization. It's the controlled introduction of redundancy into the database design. It helps improve query performance, as the number of joins can be reduced.
How do you implement one-to-one, one-to-many and many-to-many relationships while designing tables?
A one-to-one relationship can be implemented as a single table, and rarely as two tables with primary and foreign key relationships. One-to-many relationships are implemented by splitting the data into two tables with a primary key and foreign key relationship. Many-to-many relationships are implemented using a junction table, with the keys from both tables forming the composite primary key of the junction table, as sketched below. It's a good idea to read up on a database design fundamentals textbook.
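For instance, a junction table for hypothetical Students and Courses tables might look like this:
CREATE TABLE StudentCourses (
StudentID int NOT NULL REFERENCES Students(StudentID),
CourseID int NOT NULL REFERENCES Courses(CourseID),
CONSTRAINT PK_StudentCourses PRIMARY KEY (StudentID, CourseID)
)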
What's the difference between a primary key and a unique key?
Both primary key and unique key constraints enforce the uniqueness of the column on which they are defined. But by default, a primary key creates a clustered index on the column, whereas a unique key creates a nonclustered index by default. Another major difference is that a primary key doesn't allow NULLs, while a unique key allows one NULL only.
What are user defined datatypes and when you should go for them?
User-defined datatypes let you extend the base SQL Server datatypes by providing a descriptive name and format to the database. For example, say your database has a column called Flight_Num that appears in many tables, and in all of them it should be varchar(8). In this case you could create a user-defined datatype called Flight_num_type of varchar(8) and use it across all your tables.
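In SQL Server 2000 that would look like this (using the names from the example above):
EXEC sp_addtype 'Flight_num_type', 'varchar(8)', 'NOT NULL'
-- Then use it like any base datatype
CREATE TABLE Flights (Flight_Num Flight_num_type)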
What is bit datatype and what's the information that can be stored inside a bit column?
The bit datatype is used to store boolean information, like 1 or 0 (true or false). Until SQL Server 6.5, the bit datatype could hold either a 1 or a 0, and there was no support for NULL. But from SQL Server 7.0 onwards, the bit datatype can represent a third state, which is NULL.
Define candidate key, alternate key, composite key.
A candidate key is one that can identify each row of a table uniquely. Generally a candidate key becomes the primary key of the table. If the table has more than one candidate key, one of them becomes the primary key, and the rest are called alternate keys. A key formed by combining at least two columns is called a composite key.
What are defaults? Is there a column to which a default can't be bound?
A default is a value that will be used by a column if no value is supplied to that column while inserting data. IDENTITY columns and timestamp columns can't have defaults bound to them. See CREATE DEFAULT in books online.
SQL Server architecture
What is a transaction and what are ACID properties?
A transaction is a logical unit of work in which all the steps must be performed, or none. ACID stands for Atomicity, Consistency, Isolation, Durability; these are the properties of a transaction. For more information and an explanation of these properties, see SQL Server books online or any RDBMS fundamentals textbook.
Explain different isolation levels
An isolation level determines the degree of isolation of data between concurrent transactions. The default SQL Server isolation level is Read Committed. Here are the isolation levels, in ascending order of isolation: Read Uncommitted, Read Committed, Repeatable Read, Serializable. See SQL Server books online for an explanation of the isolation levels. Be sure to read about SET TRANSACTION ISOLATION LEVEL, which lets you customize the isolation level at the connection level.
CREATE INDEX myIndex ON myTable(myColumn)
What type of Index will get created after executing the above statement?
Non-clustered index. An important thing to note: by default, a clustered index gets created on the primary key, unless specified otherwise.
What's the maximum size of a row?
8060 bytes. Don't be surprised with questions like 'what is the maximum number of columns per table'. Check out SQL Server books online for the page titled: "Maximum Capacity Specifications".
Explain Active/Active and Active/Passive cluster configurations
Hopefully you have experience setting up cluster servers. But if you don't, at least be familiar with the way clustering works and with the two clustering configurations, Active/Active and Active/Passive. SQL Server books online has enough information on this topic, and there is a good white paper available on the Microsoft site.
Explain the architecture of SQL Server
This is a very important question and you had better be able to answer it if you consider yourself a DBA. SQL Server books online is the best place to read about SQL Server architecture; read the chapter dedicated to it.
What is lock escalation?
Lock escalation is the process of converting many low-level locks (like row locks and page locks) into higher-level locks (like table locks). Every lock is a memory structure, so too many locks would mean more memory occupied by locks. To prevent this from happening, SQL Server escalates the many fine-grain locks into fewer coarse-grain locks. The lock escalation threshold was definable in SQL Server 6.5, but from SQL Server 7.0 onwards it's dynamically managed by SQL Server.
What's the difference between DELETE TABLE and TRUNCATE TABLE commands?
DELETE TABLE is a logged operation, so the deletion of each row gets logged in the transaction log, which makes it slow. TRUNCATE TABLE also deletes all the rows in a table, but it won't log the deletion of each row; instead it logs the deallocation of the data pages of the table, which makes it faster. Of course, TRUNCATE TABLE can still be rolled back when executed within a transaction.
Explain the storage models of OLAP
Check out MOLAP, ROLAP and HOLAP in SQL Server books online for more information.
What are the new features introduced in SQL Server 2000 (or the latest release of SQL Server at the time of your interview)? What changed between the previous version of SQL Server and the current version?
This question is generally asked to see how current your knowledge is. Generally there is a section at the beginning of books online titled "What's New", which has all such information. Of course, reading just that is not enough; you should have tried those things to better answer the questions. Also check out the section titled "Backward Compatibility" in books online, which talks about the changes that have taken place in the new version.
What are constraints? Explain different types of constraints.
Constraints enable the RDBMS to enforce the integrity of the database automatically, without needing you to create triggers, rules, or defaults. Types of constraints: NOT NULL, CHECK, UNIQUE, PRIMARY KEY, FOREIGN KEY. For an explanation of these constraints, see the books online pages titled "Constraints", "CREATE TABLE", and "ALTER TABLE".
What is an index? What are the types of indexes? How many clustered indexes can be created on a table? If I create a separate index on each column of a table, what are the advantages and disadvantages of this approach?
Indexes in SQL Server are similar to the indexes in books: they help SQL Server retrieve data quicker. Indexes are of two types, clustered and non-clustered. When you create a clustered index on a table, all the rows in the table are stored in the order of the clustered index key, so there can be only one clustered index per table. Non-clustered indexes have their own storage, separate from the table data. Non-clustered indexes are stored as B-tree structures (as are clustered indexes), with the leaf-level nodes holding the index key and its row locator. The row locator could be the RID or the clustered index key, depending upon the absence or presence of a clustered index on the table. If you create an index on each column of a table, it improves query performance, as the query optimizer can choose from all the existing indexes to come up with an efficient execution plan. At the same time, data modification operations (such as INSERT, UPDATE, DELETE) will become slow, as every time data changes in the table all the indexes need to be updated. Another disadvantage is that indexes need disk space; the more indexes you have, the more disk space is used.
Database administration
What is RAID and what are different types of RAID configurations?
RAID stands for Redundant Array of Inexpensive Disks, and it is used to provide fault tolerance to database servers. There are six RAID levels, 0 through 5, offering different levels of performance and fault tolerance. MSDN has some information about RAID levels, and for detailed information check out the RAID Advisory Board's home page.
What are the steps you will take to improve performance of a poor performing query?
This is a very open-ended question, and there could be a lot of reasons behind the poor performance of a query. But some general issues that you could talk about would be: no indexes, table scans, missing or out-of-date statistics, blocking, excess recompilations of stored procedures, procedures and triggers without SET NOCOUNT ON, poorly written queries with unnecessarily complicated joins, too much normalization, and excess usage of cursors and temporary tables.
Some of the tools/ways that help you troubleshoot performance problems are: SET SHOWPLAN_ALL ON, SET SHOWPLAN_TEXT ON, SET STATISTICS IO ON, SQL Server Profiler, Windows NT/2000 Performance Monitor, and the graphical execution plan in Query Analyzer.
What are the steps you will take, if you are tasked with securing an SQL Server?
Again, this is another open-ended question. Here are some things you could talk about: preferring NT authentication; using server, database, and application roles to control access to the data; securing the physical database files using NTFS permissions; using an unguessable SA password; restricting physical access to the SQL Server; renaming the Administrator account on the SQL Server computer; disabling the Guest account; enabling auditing; using multiprotocol encryption; setting up SSL; setting up firewalls; isolating SQL Server from the web server; etc.
What is a deadlock and what is a live lock? How will you go about resolving deadlocks?
A deadlock is a situation in which two processes, each having a lock on one piece of data, attempt to acquire a lock on the other's piece. Each process would wait indefinitely for the other to release the lock unless one of the user processes is terminated; SQL Server detects deadlocks and terminates one user's process.
A livelock is one where a request for an exclusive lock is repeatedly denied because a series of overlapping shared locks keeps interfering. SQL Server detects the situation after four denials and refuses further shared locks. A livelock also occurs when read transactions monopolize a table or page, forcing a write transaction to wait indefinitely.
Check out SET DEADLOCK_PRIORITY and "Minimizing Deadlocks" in SQL Server books online. Also check out the article Q169960 from Microsoft knowledge base.
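One related setting worth knowing is SET DEADLOCK_PRIORITY, which lets a session volunteer to be the victim rather than leaving the choice entirely to SQL Server:
SET DEADLOCK_PRIORITY LOW  -- if a deadlock occurs, choose this session as the victim
-- run the deadlock-prone batch here; the application should retry on error 1205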
What is blocking and how would you troubleshoot it?
Blocking happens when one connection from an application holds a lock and a second connection requires a conflicting lock type. This forces the second connection to wait, blocked on the first.
Explain CREATE DATABASE syntax
Many of us are used to creating databases from the Enterprise Manager or by just issuing the command CREATE DATABASE MyDB. But what if you have to create a database with two filegroups, one on drive C and the other on drive D, with the log on drive E, an initial size of 600 MB, and a growth factor of 15%? That's why, being a DBA, you should be familiar with the CREATE DATABASE syntax, as sketched below.
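A sketch answering that exact scenario (file names and paths are illustrative):
CREATE DATABASE MyDB
ON PRIMARY
(NAME=MyDB_data1, FILENAME='C:\mssql\data\MyDB_data1.mdf', SIZE=600MB, FILEGROWTH=15%),
FILEGROUP FG2
(NAME=MyDB_data2, FILENAME='D:\mssql\data\MyDB_data2.ndf', SIZE=600MB, FILEGROWTH=15%)
LOG ON
(NAME=MyDB_log, FILENAME='E:\mssql\log\MyDB_log.ldf', SIZE=600MB, FILEGROWTH=15%)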
How to restart SQL Server in single user mode? How to start SQL Server in minimal configuration mode?
SQL Server can be started from the command line using SQLSERVR.EXE. This EXE has some very important parameters with which a DBA should be familiar: -m is used for starting SQL Server in single-user mode, and -f is used to start SQL Server in minimal configuration mode. Check out SQL Server books online for more parameters and their explanations.
As a part of your job, what are the DBCC commands that you commonly use for database maintenance?
DBCC CHECKDB, DBCC CHECKTABLE, DBCC CHECKCATALOG, DBCC CHECKALLOC, DBCC SHOWCONTIG, DBCC SHRINKDATABASE, DBCC SHRINKFILE etc. But there are a whole load of DBCC commands which are very useful for DBAs. Check out SQL Server books online for more information.
What are statistics, under what circumstances they go out of date, how do you update them?
Statistics determine the selectivity of the indexes. If an indexed column has unique values, the selectivity of that index is higher than that of an index on a column with non-unique values. The query optimizer uses this statistical information in determining whether to choose an index while executing a query.
Some situations under which you should update statistics:
1) If there is significant change in the key values in the index
2) If a large amount of data in an indexed column has been added, changed, or removed (that is, if the distribution of key values has changed), or the table has been truncated using the TRUNCATE TABLE statement and then repopulated
3) Database is upgraded from a previous version
Look up SQL Server books online for the following commands: UPDATE STATISTICS, STATS_DATE, DBCC SHOW_STATISTICS, CREATE STATISTICS, DROP STATISTICS, sp_autostats, sp_createstats, sp_updatestats
What are the different ways of moving data/databases between servers and databases in SQL Server?
There are lots of options available; you have to choose your option depending upon your requirements. Some of the options you have are: BACKUP/RESTORE, detaching and attaching databases, replication, DTS, BCP, log shipping, INSERT...SELECT, SELECT...INTO, and creating INSERT scripts to generate data.
Explain the different types of BACKUPs available in SQL Server. Given a particular scenario, how would you go about choosing a backup plan?
The types of backups you can create in SQL Server 7.0+ are: full database backup, differential database backup, transaction log backup, and file/filegroup backup. Check out the BACKUP and RESTORE commands in SQL Server books online. Be prepared to write the commands in your interview. Books online also has information on the detailed backup/restore architecture and when one should go for a particular kind of backup.
What is database replication? What are the different types of replication you can set up in SQL Server?
Replication is the process of copying/moving data between databases on the same or different servers. SQL Server supports the following types of replication scenarios:
* Snapshot replication
* Transactional replication (with immediate updating subscribers, with queued updating subscribers)
* Merge replication
See SQL Server books online for in-depth coverage of replication. Be prepared to explain how the different replication agents function, what the main system tables used in replication are, etc.
How to determine the service pack currently installed on SQL Server?
The global variable @@VERSION stores the build number of sqlservr.exe, which can be used to determine the service pack installed, as shown below. To know more about this process, see the SQL Server service packs and versions pages.
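For example (SERVERPROPERTY is available from SQL Server 2000 onwards):
SELECT @@VERSION
SELECT SERVERPROPERTY('ProductVersion') AS Build,
SERVERPROPERTY('ProductLevel') AS ServicePack  -- e.g. 8.00.760 / SP3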
Database programming
What are cursors? Explain different types of cursors. What are the disadvantages of cursors? How can you avoid cursors?
Cursors allow row-by-row processing of result sets. Types of cursors: static, dynamic, forward-only, keyset-driven.
Disadvantages of cursors: each time you fetch a row from the cursor, it results in a network roundtrip, whereas a normal SELECT query makes only one roundtrip, however large the result set is. Cursors are also costly because they require more resources and temporary storage (resulting in more IO operations). Further, there are restrictions on the SELECT statements that can be used with some types of cursors. Most of the time, set-based operations can be used instead of cursors. Here is an example:
If you have to give a flat hike to your employees using the following
criteria:
Salary between 30000 and 40000 -- 5000 hike
Salary between 40000 and 55000 -- 7000 hike
Salary between 55000 and 65000 -- 9000 hike
In this situation many developers tend to use a cursor: determine each employee's salary and update it according to the above formula. But the same can be achieved with multiple UPDATE statements, or combined into a single UPDATE statement as shown below:
UPDATE tbl_emp SET salary = CASE
    WHEN salary BETWEEN 30000 AND 40000 THEN salary + 5000
    WHEN salary BETWEEN 40000 AND 55000 THEN salary + 7000
    WHEN salary BETWEEN 55000 AND 65000 THEN salary + 9000
    ELSE salary -- leave other salaries unchanged (without this, they would be set to NULL)
END
Another situation in which developers tend to use cursors: you need to call a stored procedure when a column in a particular row meets a certain condition. You don't have to use cursors for this; it can be achieved using a WHILE loop, as long as there is a unique key to identify each row, as in the sketch below. For more examples of using a WHILE loop for row-by-row processing, check out the 'My code library' section of my site or search for WHILE.
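Here is a minimal sketch of that WHILE-loop pattern; the table (tbl_emp), its unique key (emp_id) and the procedure (usp_process_emp) are all hypothetical:
DECLARE @emp_id int
SELECT @emp_id = MIN(emp_id) FROM tbl_emp
WHILE @emp_id IS NOT NULL
BEGIN
    -- Call the procedure only when the row meets the condition
    IF EXISTS (SELECT * FROM tbl_emp WHERE emp_id = @emp_id AND salary > 50000)
        EXEC usp_process_emp @emp_id
    -- Advance to the next row using the unique key
    SELECT @emp_id = MIN(emp_id) FROM tbl_emp WHERE emp_id > @emp_id
END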
Write down the general syntax for a SELECT statements covering all the options.
Here's the basic syntax (also check out SELECT in Books Online for the advanced syntax):
SELECT select_list [INTO new_table] FROM table_source [WHERE search_condition] [GROUP BY group_by_expression] [HAVING search_condition] [ORDER BY order_expression [ASC | DESC]]
What is a join and explain different types of joins.
Joins are used in queries to specify how different tables are related. Joins also let you select data from one table based on data in another table.
Types of joins: INNER JOINs, OUTER JOINs, CROSS JOINs. OUTER JOINs are further classified as LEFT OUTER JOINS, RIGHT OUTER JOINS and FULL OUTER JOINS.
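A minimal sketch of each type, using hypothetical tbl_emp and tbl_dept tables:
-- INNER JOIN: only employees that have a matching department
SELECT e.emp_name, d.dept_name FROM tbl_emp e INNER JOIN tbl_dept d ON e.dept_id = d.dept_id
-- LEFT OUTER JOIN: all employees; dept_name is NULL where there is no match
SELECT e.emp_name, d.dept_name FROM tbl_emp e LEFT OUTER JOIN tbl_dept d ON e.dept_id = d.dept_id
-- CROSS JOIN: every employee paired with every department
SELECT e.emp_name, d.dept_name FROM tbl_emp e CROSS JOIN tbl_dept d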
Can you have a nested transaction?
Yes, very much. Check out BEGIN TRAN, COMMIT, ROLLBACK, SAVE TRAN and @@TRANCOUNT.
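A minimal sketch of how @@TRANCOUNT behaves with nested transactions:
BEGIN TRAN outer_tran
SELECT @@TRANCOUNT -- returns 1
BEGIN TRAN inner_tran
SELECT @@TRANCOUNT -- returns 2
COMMIT TRAN inner_tran -- only decrements @@TRANCOUNT to 1; nothing is committed yet
COMMIT TRAN outer_tran -- @@TRANCOUNT drops to 0 and the work is actually committed
Note that a ROLLBACK (without a savepoint name) rolls back the entire transaction to the outermost BEGIN TRAN, regardless of nesting level.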
What is an extended stored procedure? Can you instantiate a COM object by using T-SQL?
An extended stored procedure is a function within a DLL (written in a programming language like C, C++ using Open Data Services (ODS) API) that can be called from T-SQL, just the way we call normal stored
procedures using the EXEC statement. See books online to learn how to create extended stored procedures and how to add them to SQL Server.
Yes, you can instantiate a COM object (written in languages like VB or VC++) from T-SQL by using the sp_OACreate stored procedure, as sketched below. For an example of creating a COM object in VB and calling it from T-SQL, see the 'My code library' section of this site.
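Here is a minimal sketch; the ProgID ('Scripting.FileSystemObject') is only an illustrative example of a COM class:
DECLARE @hr int, @object int
-- Create the COM object and get a handle to it
EXEC @hr = sp_OACreate 'Scripting.FileSystemObject', @object OUT
IF @hr <> 0
    PRINT 'Could not create the COM object'
ELSE
BEGIN
    -- Call methods with sp_OAMethod and read properties with sp_OAGetProperty here
    EXEC sp_OADestroy @object -- always release the object when done
END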
What is the system function to get the current user's user id?
USER_ID(). Also check out other system functions like USER_NAME(), SYSTEM_USER, SESSION_USER, CURRENT_USER, USER, SUSER_SID(), HOST_NAME().
What are triggers? How many triggers you can have on a table? How to invoke a trigger on demand?
Triggers are special kind of stored procedures that get executed automatically when an INSERT, UPDATE or DELETE operation takes place on a table.
In SQL Server 6.5 you could define only 3 triggers per table: one for INSERT, one for UPDATE and one for DELETE. From SQL Server 7.0 onwards this restriction is gone, and you can create multiple triggers for each action, but in 7.0 there is no way to control the order in which the triggers fire. In SQL Server 2000 you can specify which trigger fires first or fires last using sp_settriggerorder.
Triggers can't be invoked on demand. They fire only when an associated action (INSERT, UPDATE, DELETE) happens on the table on which they are defined. Triggers are generally used to implement business rules and auditing. Triggers can also extend referential integrity checks, but wherever possible use constraints for this purpose instead, as constraints are much faster. Up to SQL Server 7.0, triggers fire only after the data modification happens, so they are in effect post (AFTER) triggers. In SQL Server 2000 you can also create INSTEAD OF triggers, which fire in place of the triggering action.
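As an illustration, a minimal sketch of an AFTER trigger used for auditing; tbl_emp and tbl_emp_audit are hypothetical tables:
CREATE TRIGGER trg_emp_audit ON tbl_emp
AFTER UPDATE
AS
BEGIN
    -- The inserted pseudo-table holds the new values of the updated rows
    INSERT INTO tbl_emp_audit (emp_id, changed_on)
    SELECT emp_id, GETDATE() FROM inserted
END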
There is a trigger defined for INSERT operations on a table in an OLTP system. The trigger is written to instantiate a COM object and pass the newly inserted rows to it for some custom processing. What do you think of this implementation? Can this be implemented better?
Instantiating COM objects is a time-consuming process, and since you are doing it from within a trigger, it slows down the data insertion. The same is true of sending emails from triggers. This scenario is better implemented by logging the necessary data into a separate table and having a job that periodically checks the table and performs the custom processing.
What is a self join? Explain it with an example.
A self join is just like any other join, except that two instances of the same table are joined in the query. Here is an example: an Employees table contains rows for regular employees as well as their managers, so to find the manager of each employee, you need a self join (sketched below).
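A minimal sketch, assuming a hypothetical Employees table where mgr_id holds the emp_id of the employee's manager:
SELECT e.emp_name AS employee, m.emp_name AS manager
FROM Employees e
LEFT OUTER JOIN Employees m ON e.mgr_id = m.emp_id
-- LEFT OUTER JOIN keeps employees who have no manager (manager comes back as NULL)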
1. What is a Stored Procedure? - It's nothing but a set of T-SQL statements combined to perform a single task or several tasks. It's basically like a macro: when you invoke the stored procedure, you actually run a set of statements.
2. Can you give an example of a Stored Procedure? - sp_helpdb, sp_who2 and sp_renamedb are system-defined stored procedures. We can also have user-defined stored procedures, which are called in a similar way.
3. What is a view? - If we have several tables in a database and we want to expose only specific columns from specific tables, we can go for views. Views also serve security needs, allowing specific users to see only specific columns based on the permissions we configure on the view. Views also reduce the effort required to write queries that access the same specific columns every time.
4. What is an Index? - An index determines how a table's data is organized for lookups, so queries can locate rows without scanning the whole table; data retrieval is much faster when an appropriate index exists.
5. What are the types of indexes available with SQL Server? - There are basically two types of indexes that we use with SQL Server: clustered and non-clustered.
6. What is the basic difference between a clustered and a non-clustered index? - A clustered index is unique for any given table: we can have only one clustered index per table. The leaf level of a clustered index is the actual data, and the data is re-sorted to match the index key. In a non-clustered index, the leaf level is a pointer to the data rows, so we can have as many non-clustered indexes as we need on a table (as sketched below).
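A minimal sketch of creating both index types on a hypothetical tbl_emp table:
-- Only one clustered index per table; the data itself is sorted on this key
CREATE CLUSTERED INDEX IX_emp_id ON tbl_emp (emp_id)
-- Many non-clustered indexes are allowed; their leaf level points to the data rows
CREATE NONCLUSTERED INDEX IX_emp_salary ON tbl_emp (salary)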
7. What are cursors? - Cursors help us perform an operation row by row on a set of data that we retrieve with a SELECT statement. For example, if we have duplicate records in a table, we can declare a cursor that checks the records one by one during retrieval and removes the rows that have duplicate values.
8. When do we use the UPDATE STATISTICS command? - This command is used after large data changes. If we do a large number of deletions, modifications or bulk copies into tables, we need to update the statistics so the optimizer can take these changes into account. UPDATE STATISTICS refreshes the statistics on the tables' indexes accordingly.
9. Which TCP/IP port does SQL Server run on? - SQL Server listens on port 1433 by default, but we can change it for better security.
10. From where can you change the default port? - From the Server Network Utility: TCP/IP properties -> Port number, on both the client and the server.
11. Can you tell me the difference between DELETE & TRUNCATE commands? - DELETE removes rows from a table based on the condition we provide in a WHERE clause. TRUNCATE removes all the rows from a table, leaving no data in the table after the command runs.
12. Can we use Truncate command on a table which is referenced by FOREIGN KEY? - No. We cannot use Truncate command on a table with Foreign Key because of referential integrity.
13. What is the use of DBCC commands? - DBCC stands for Database Consistency Checker. We use these commands to check the consistency of databases: maintenance, validation tasks and status checks.
14. Can you give me some DBCC command options? (Database consistency check) - DBCC CHECKDB - ensures that tables and indexes in the database are correctly linked. DBCC CHECKALLOC - checks that all pages in a database are correctly allocated. DBCC SQLPERF - reports current transaction log usage as a percentage. DBCC CHECKFILEGROUP - checks all tables in a filegroup for damage.
15. What command do we use to rename a db? - sp_renamedb 'oldname' , 'newname'
16. Sometimes sp_renamedb may not work: if someone is using the db, it will not accept the command. What can you do in such cases? - We can first bring the db into single-user mode using sp_dboption, then rename it, and then rerun sp_dboption to remove single-user mode, as shown below.
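A minimal sketch of that sequence; 'olddb' and 'newdb' are placeholder names:
-- Put the database into single-user mode
EXEC sp_dboption 'olddb', 'single user', 'true'
-- Rename it
EXEC sp_renamedb 'olddb', 'newdb'
-- Return to multi-user mode
EXEC sp_dboption 'newdb', 'single user', 'false'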
17. What is the difference between a HAVING clause and a WHERE clause? - HAVING is used only together with GROUP BY and filters the groups after aggregation. WHERE is applied to each row before the rows become part of the GROUP BY aggregation (see the sketch below).
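A minimal sketch on a hypothetical tbl_emp table:
SELECT dept_id, COUNT(*) AS emp_count
FROM tbl_emp
WHERE salary > 30000 -- row filter, applied before grouping
GROUP BY dept_id
HAVING COUNT(*) > 5 -- group filter, applied after aggregation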
18. What do you mean by COLLATION? - Collation is basically the sort order. There are three types of sort order: dictionary order case sensitive, dictionary order case insensitive, and binary order.
19. What is a Join in SQL Server? - Join actually puts data from two or more tables into a single result set.
20. Can you explain the types of Joins that we can have with Sql Server? - There are three types of joins: Inner Join, Outer Join, Cross Join
21. When do you use SQL Profiler? - The SQL Profiler utility allows us to track connections to SQL Server and monitor activity such as which SQL scripts are running, failed jobs, etc.
22. What is a Linked Server? - Linked servers are a concept in SQL Server by which we can register other servers and query both servers' databases using T-SQL statements.
23. Can you link only other SQL Servers, or any database servers such as Oracle? - We can link any server provided there is an OLE DB provider that allows the link. For Oracle, Microsoft provides an OLE DB provider for Oracle to add it as a linked server.
24. Which stored procedure will you be running to add a linked server? - sp_addlinkedserver, sp_addlinkedsrvlogin
25. What are the OS services that the SQL Server installation adds? - The MSSQLServer service, the SQLServerAgent service, and MS DTC (Microsoft Distributed Transaction Coordinator).
26. Can you explain the role of each service? - MSSQLServer runs the databases. SQLServerAgent handles automation such as jobs, database maintenance and backups. MS DTC coordinates distributed transactions that span multiple servers.
27. How do you troubleshoot SQL Server if it's running very slow? - First, check processor and memory usage to see that the processor is not above 80% utilization and memory not above 40-45% utilization, then check disk utilization using Performance Monitor. Second, use SQL Profiler to check the current users, SQL activity and running jobs that might be the problem. Third, run UPDATE STATISTICS to refresh the statistics on the indexes.
28. Let's say that due to network or security issues the client is not able to connect to the server, or vice versa. How do you troubleshoot? - First, ensure that the port settings are correct in the Server and Client Network Utility, and that ODBC is properly configured at the client end. Makepipe and readpipe are utilities to check for connection issues: makepipe is run on the server and readpipe on the client.
29. What are the authentication modes in SQL Server? - Windows mode and mixed mode (SQL & Windows).
30. Where do you think the user names and passwords are stored in SQL Server? - They are stored in the sysxlogins table in the master database.
31. What is log shipping? Can we do log shipping with SQL Server 7.0? - Log shipping is a new feature of SQL Server 2000; we need two servers running Enterprise Edition, and it is configured from Enterprise Manager. In log shipping, the transaction log from one server is automatically applied to the standby database on the other server. If one server fails, the other server has an up-to-date copy of the database and can serve as the DR (disaster recovery) plan.
32. Let us say SQL Server crashed and you are rebuilding the databases, including the master database. What procedure do you follow? - To restore the master db, we have to stop SQL Server first and then start it from the command line with sqlservr.exe -m, which brings it up in single-user (maintenance) mode, after which we can restore the master db.
33. Let us say the master db itself has no backup. Now you have to rebuild the db, so what kind of action do you take? - In SQL Server 2000 the master database can be rebuilt with the Rebuild Master utility (rebuildm.exe), after which the other system and user databases can be reattached or restored.
34. What is BCP? When do we use it? - BCP (Bulk Copy Program) is a command-line tool used to copy large amounts of data into and out of tables and views. It copies only the data, not the structures.
35. What should we do to copy tables, schema and views from one SQL Server to another? - We have to write DTS packages for it.
What is Extended Stored Procedure in SQL Server 20...
The Microsoft Extended Stored Procedure API provides a server-based application programming interface (API) for extending Microsoft SQL Server functionality. The API consists of C and C++ functions and macros used to build applications in the following categories: extended stored procedures and gateway applications.
Extended stored procedures allow you to create your own external routines in a programming language such as C. The extended stored procedures appear to users as typical stored procedures and are executed in the same way. Parameters can be passed to extended stored procedures, and they can return results and return status.
Write a SQL query to find the first weekday of the month.
-- Returns the name of the weekday (e.g. Monday) on which the current month begins
SELECT DATENAME(dw, DATEADD(dd, - DATEPART(dd, GETDATE()) + 1, GETDATE())) AS FirstDay
What is sorting and what is the difference between sorting and clustered indexes?
The ORDER BY clause sorts query results by one or more columns (up to 8,060 bytes) at the time the data is retrieved from the database. A clustered index, by contrast, physically sorts the data as rows are inserted into or updated in the table.
What are the differences between UNION and JOINS?
A join combines columns from two or more tables into each result row; a union appends the rows of one result set to another. For example:
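A minimal sketch of the contrast, using hypothetical tables:
-- JOIN: columns from both tables appear side by side in each result row
SELECT e.emp_name, d.dept_name FROM tbl_emp e INNER JOIN tbl_dept d ON e.dept_id = d.dept_id
-- UNION: rows of two compatible result sets are stacked into one
SELECT emp_name FROM tbl_current_emp
UNION
SELECT emp_name FROM tbl_former_emp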
What is the Referential Integrity?
Referential integrity refers to the consistency that must be maintained between primary and foreign keys, i.e. every foreign key value must have a corresponding primary key value.
What is the maximum row size in SQL Server 2000?
8,060 bytes.
How to determine the service pack currently installed on SQL Server?
The global variable @@Version stores the build number of the sqlservr.exe, which is used to determine the service pack installed. eg: Microsoft SQL Server 2000 - 8.00.760 (Intel X86) Dec 17 2002 14:22:05 Copyright (c) 1988-2003 Microsoft Corporation Enterprise Edition on Windows NT 5.0 (Build 2195: Service Pack 3)
What is the purpose of UPDATE STATISTICS?
Updates information about the distribution of key values for one or more statistics groups (collections) in the specified table or indexed view.
What is the use of SCOPE_IDENTITY() function?
Returns the last identity value generated within the current execution scope (stored procedure, trigger, function or batch). For example:
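A minimal sketch, assuming a hypothetical tbl_emp table with an IDENTITY column:
INSERT INTO tbl_emp (emp_name) VALUES ('John Doe')
-- Returns the identity value generated by the INSERT above; unlike @@IDENTITY,
-- it is not affected by identity values generated inside triggers
SELECT SCOPE_IDENTITY() AS new_emp_id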
How do you transfer data from text file to database (other than DTS)?
Using the BCP (Bulk Copy Program) utility.
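A minimal command-line sketch; the database, table, file path and server name are all placeholders:
REM Import a comma-delimited text file into a table, using a trusted connection
bcp MyDB..tbl_emp in C:\data\emp.txt -c -t, -S MYSERVER -T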
What's the difference between DELETE TABLE and TRUNCATE TABLE commands?
DELETE is a fully logged operation: the deletion of each row is recorded in the transaction log, which makes it slow. TRUNCATE TABLE also deletes all the rows in a table, but it doesn't log individual row deletions; instead it logs the deallocation of the table's data pages, which makes it faster. TRUNCATE TABLE can still be rolled back if it is executed inside an explicit transaction.
What is a deadlock?
Deadlock is a situation when two processes, each having a lock on one piece of data, attempt to acquire a lock on the other's piece. Each process would wait indefinitely for the other to release the lock, unless one of the user processes is terminated. SQL Server detects deadlocks and terminates one user's process.
What is a LiveLock?
A livelock occurs when a request for an exclusive lock is repeatedly denied because a series of overlapping shared locks keeps interfering. SQL Server detects the situation after four denials and refuses further shared locks. A livelock also occurs when read transactions monopolize a table or page, forcing a write transaction to wait indefinitely.
How to restart SQL Server in single user mode?
From Startup Options: go to SQL Server Properties by right-clicking the server name in Enterprise Manager. Under the 'General' tab, click 'Startup Parameters' and enter -m as a parameter.
Does SQL Server 2000 clustering support load balancing?
SQL Server 2000 clustering does not provide load balancing; it provides failover support. To achieve load balancing, you need software that balances the load between clusters, not between servers within a cluster.
What is DTC?
The Microsoft Distributed Transaction Coordinator (MS DTC) is a transaction manager that allows client applications to include several different sources of data in one transaction. MS DTC coordinates committing the distributed transaction across all the servers enlisted in the transaction.
What is DTS?
Microsoft® SQL Server™ 2000 Data Transformation Services (DTS) is a set of graphical tools and programmable objects that lets you extract, transform, and consolidate data from disparate sources into single or multiple destinations.
What are defaults? Is there a column to which a default can't be bound?
A default is a value that will be used by a column, if no value is supplied to that column while inserting data. IDENTITY columns and timestamp columns can't have defaults bound to them.
What are the constraints ?
Table constraints define rules for the values allowed in columns and are the standard mechanism for enforcing integrity. SQL Server 2000 supports five classes of constraints: NOT NULL, CHECK, UNIQUE, PRIMARY KEY and FOREIGN KEY (see the sketch below).
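A minimal sketch showing all five constraint classes on one hypothetical table (tbl_dept is assumed to exist):
CREATE TABLE tbl_emp (
    emp_id int NOT NULL PRIMARY KEY, -- NOT NULL and PRIMARY KEY
    email varchar(100) UNIQUE, -- UNIQUE
    salary int CHECK (salary > 0), -- CHECK
    dept_id int FOREIGN KEY REFERENCES tbl_dept (dept_id) -- FOREIGN KEY
)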
What is Transaction?
A transaction is a sequence of operations performed as a single logical unit of work. A logical unit of work must exhibit four properties, called the ACID (Atomicity, Consistency, Isolation, and Durability) properties, to qualify as a transaction.
What is Isolation Level?
An isolation level determines the degree of isolation of data between concurrent transactions. The default SQL Server isolation level is Read Committed. A lower isolation level increases concurrency, but at the expense of data correctness. Conversely, a higher isolation level ensures that data is correct, but can affect concurrency negatively. The isolation level required by an application determines the locking behavior SQL Server uses. SQL-92 defines the following isolation levels, all of which are supported by SQL Server: