This post has already been read 900 times!
In this article (Italian), we have already discussed Continuous Integration and DLM. We described the main reasons for putting a database under source control and for testing our databases. In this article, we are going to discuss a real scenario of SQL Server unit testing with the tSQLt framework.
What Is Unit Testing
In computer programming, unit testing is a software testing method by which individual units are tested to determine if they are fit for use. With “units”, we mean portions of source code, sets of one or more modules together with the associated control data, usage, and operating procedures. The primary purpose of this approach is to find bugs and prevent regressions (Source: Wikipedia).
When speaking about programming, tests are a particular type of method (with attributes or naming conventions) that run “real” methods in order to verify the real methods' expected behaviors. Supposing to have a method called Power(x, y), a test could be a class method called Power_should_return_eight_when_x_is_two_and_y_is_three() that runs the Power(2, 3) method and expects 8 as the results. On the database side, a test could be a stored procedure called Power_should_return_eight_when_x_is_two_and_y_is_three that runs the udf_Power(x, y) function.
Why Unit Testing?
There are many benefits to performing tests. As described in the definition, unit testing is a practice by which we can prevent regressions and find bugs, so, we can fix these bugs as soon we are alerted they exist.
In my experience, our development team has raised the quality of our code (tests and code work together to achieve better code). Additionally, we have migrated from a serial approach to an evolutionary development one. We used to identify all the requirements before starting to implement features in order to get a detailed database design (which supports “every” case). We would then start development and finally deploy our application (serial). Now we are changing our code in a series of little releases rather than one big one (iterative, evolutionary). Thus, our code has become more flexible.
Executing tests allowed us to have a reliable deployment process, because many problems were caught during the test phase. Our customers experienced many great advantages using this technique. Additionally, creating tests has helped the code self-documentation, because we can understand better what a method is supposed to do in a functional point of view.
I would like to discuss two different real scenarios in which I have found unit testing strong and useful. I worked as a consultant in a company that implemented ERP solutions. One of my jobs was to fix problems found by the dev team during the development of a search feature in the application. A stored procedure was often the troublemaker. First, I'll describe how we started to put the procedure under unit tests; then I'll examine how we found the unit-testing process useful when refactoring the database.
First Scenario: Put a Stored Procedure Under a Set of Unit Tests
The stored procedure called SearchUsers searches for users in a table using multiple filters. The developers changed that stored procedure many times. As the customer noted, the changes often caused an unpredictable break in the application.
The stakeholder's request was to be able to filter the results by different columns and with different data types, such as user id, last name that “starts with”, etc. All these filters must be mutually exclusive and contained in a single stored procedure (customer design requirement). The procedure should return also additional details about the users if available, with this data stored in a different table. If the data is not available, the resultset must return at least the user table details.
We have different behaviors, but the resultset will have the same columns. Additionally, there is a JOIN condition, which allows us to get the additional fields.
At the time I first opened the stored procedure, the implementation was managed in an incoherent way. Sometimes the resultset was missing, sometimes the resultset was empty when it should return at least one row, and so on. The application layer was not aware of these outputs, so the call outcome was totally out of control. We are going to use the following model, which is similar to the customer one:
-- dbo.Users table CREATE TABLE dbo.Users ( UserId int IDENTITY(1, 1) NOT NULL PRIMARY KEY CLUSTERED , Username varchar(30) NOT NULL , FirstName varchar(30) NOT NULL , LastName varchar(30) NOT NULL , BirthDate date NOT NULL ); GO -- dbo.UserContacts table CREATE TABLE dbo.UserContacts ( UserId int NOT NULL PRIMARY KEY CLUSTERED , Street varchar(50) NOT NULL , Town varchar(30) NOT NULL , ZipCode char(5) NOT NULL , StateOrProvince varchar(30) NOT NULL ); GO -- FKs ALTER TABLE dbo.UserContacts ADD CONSTRAINT FK_Users_UserContacts FOREIGN KEY (UserId) REFERENCES dbo.Users (UserId); GO -- customer procedure CREATE PROCEDURE dbo.SearchUsers @UserId int = NULL , @LastName varchar(30) = NULL AS BEGIN SET NOCOUNT ON; IF @UserId IS NOT NULL OR @LastName IS NOT NULL BEGIN SELECT U.UserId , U.Username , U.FirstName , U.LastName , U.BirthDate , UC.Street , UC.Town , UC.ZipCode , UC.StateOrProvince FROM dbo.Users U JOIN dbo.UserContacts UC ON UC.UserId = U.UserId WHERE @UserId IS NULL OR U.UserId = @UserId AND (@LastName IS NULL OR U.LastName LIKE @LastName + '%'); END; END; GO
What We Want to Test
In our unit tests, we want to test different searches results based on the parameters passed into the SearchUsers procedure. We will expect the following results:
|Evaluated parameter||Expected result|
|User id||Resultset with one record|
|User id with missing contacts||Resultset with one record|
|No given parameters||Resultset with zero records|
We are going to use tSQLt as testing framework for creating the tests.
The first test will check for the number of rows returned – one – based on an existing user id. The second test will check for the number of rows returned – one – based on existing user id, with missing contacts data. Then, the last test will verify whether an empty resultset will be returned when all the parameters are set to NULL.
Before implementing unit tests, we should create a test class that keeps tests under the same logical set. We can create a tSQLt test class using the tSQLt.NewTestClass procedure:
EXEC tSQLt.NewTestClass @ClassName = N'UserTests';
A new database schema called “UserTests” will be created and a new extended property will be attached to it, as depicted in the following picture:
The property tSQLt.TestClass indicates that “UserTests” is a test class.
Test – Call with User ID
Here is the test for the first case:
CREATE PROCEDURE UserTests.[test SearchUsers should return one row when passing an existing UserId] AS BEGIN -- Assemble (create test data) -- 1 - fake the Users and UserContacts table -- (isolates the tables copying them in two new -- tables with the name of the original ones) EXEC tSQLt.FakeTable @TableName = N'Users' , @SchemaName = N'dbo' , @Identity = 0; -- does not retain identities on the fake table (test purpose) EXEC tSQLt.FakeTable @TableName = N'UserContacts' , @SchemaName = N'dbo'; -- 2 - insert "stub" records INSERT INTO dbo.Users (UserId, Username, FirstName, LastName, BirthDate) VALUES (1, 'suxstellino', 'Alessandro', 'Alpi', '19810422'), (2, 'ettagab', 'Gabriele', 'Etta', '19940725'); INSERT INTO dbo.UserContacts (UserId, Street, Town, ZipCode, StateOrProvince) VALUES (1, 'Allen street', 'New York', '12345', 'NY'), (2, 'Madison avenue', 'New York', '12345', 'NY'); -- Act -- create a result table that will contain resultset DECLARE @Results table ( UserId int , Username varchar(30) , FirstName varchar(30) , LastName varchar(30) , BirthDate date , Street varchar(50) , Town varchar(30) , ZipCode char(5) , StateOrProvince varchar(30) ) -- execute the customer procedure and store the result into the @Results table INSERT INTO @Results (UserId, Username, FirstName, LastName, BirthDate, Street, Town, ZipCode, StateOrProvince) EXEC dbo.SearchUsers @UserId = 1; -- store the number of record into a temp variable DECLARE @NumberOfRows int = -1; SELECT @NumberOfRows = COUNT(1) FROM @Results; -- Assert EXEC tSQLt.AssertEquals @Expected = 1 , @Actual = @NumberOfRows , @Message = N'Wrong number of rows!'; END; GO
The name of the stored procedure starts with “test”. This is a tSQLt naming convention for test names. As you can see, the test starts with an assemble section. In this portion of code, we create fake tables (copy of the original one, with the same name) in order to isolate the test objects. Then, we create record inside the new tables in order to execute the SearchUsers procedure (act section).
The resultset returned by the customer stored procedure is stored on a table variable and the number of rows is computed using the COUNT function on that variable (@NumberOfRows). The last section is the assert one. We execute the tSQLt.AssertEquals procedure in order to verify the values of the @NumberOfRows variable compared to the expected value (1).
Executing the test using the tSQLt.Run procedure will show the following result:
This test was executed successfully also on the customers’ database.
Test – Call with User ID and User with Missing Contacts
The second test is similar to the first one, but now the contacts are missing for the tested user. Here is the stored procedure:
CREATE PROCEDURE UserTests.[test SearchUsers should return one row when passing an existing UserId and contacts are missing] AS BEGIN -- Assemble (create test data) EXEC tSQLt.FakeTable @TableName = N'Users' , @SchemaName = N'dbo' , @Identity = 0; -- does not retain identities on the fake table (test purpose) EXEC tSQLt.FakeTable @TableName = N'UserContacts' , @SchemaName = N'dbo'; -- 2 - insert "stub" records INSERT INTO dbo.Users (UserId, Username, FirstName, LastName, BirthDate) VALUES (2, 'ettagab', 'Gabriele', 'Etta', '19940725'); -- Act DECLARE @Results table ( UserId int , Username varchar(30) , FirstName varchar(30) , LastName varchar(30) , BirthDate date , Street varchar(50) , Town varchar(30) , ZipCode char(5) , StateOrProvince varchar(30) ) -- execute the customer procedure and store the result into the @Results table INSERT INTO @Results (UserId, Username, FirstName, LastName, BirthDate, Street, Town, ZipCode, StateOrProvince) EXEC dbo.SearchUsers @UserId = 2; -- store the number of record into a temp variable DECLARE @NumberOfRows int = -1; SELECT @NumberOfRows = COUNT(1) FROM @Results; -- Assert EXEC tSQLt.AssertEquals @Expected = 1 , @Actual = @NumberOfRows , @Message = N'Wrong number of rows!'; END; GO
We did not insert any record on the UserContacts table. The test fails, because in the customer stored procedure a JOIN is used instead of a LEFT JOIN. The result is:
After the test fail, the development team fixed the bug with a LEFT JOIN operator.
Test – No Given Parameters
Here is the test implementation:
CREATE PROCEDURE UserTests.[test SearchUsers should return one row when passing null parameters] AS BEGIN -- Assemble (create test data) EXEC tSQLt.FakeTable @TableName = N'Users' , @SchemaName = N'dbo' , @Identity = 0; -- does not retain identities on the fake table (test purpose) EXEC tSQLt.FakeTable @TableName = N'UserContacts' , @SchemaName = N'dbo'; -- Act -- Assert EXEC tSQLt.AssertResultSetsHaveSameMetaData @expectedCommand = N'EXEC dbo.SearchUsers;' , @actualCommand = N'SET FMTONLY ON; SELECT U.UserId , U.Username , U.FirstName , U.LastName , U.BirthDate , C.Street , C.Town , C.ZipCode , C.StateOrProvince FROM dbo.Users U JOIN dbo.UserContacts C ON U.UserId = C.UserId;'; END; GO
As we can see, there are differences between this test and the previous ones. We are just faking tables and executing the tSQLt.AssertResultSetsHaveSameMetaData, which verifies whether two commands return the same set of fields.
This test failed due to the IF condition used on the top of the SearchUsers stored procedure. The following message is returned:
The metadata cannot be tested because the SearchUsers stored procedure does not return any resultset. In this case, the development team fixed the issue moving the condition just before the database call, into the application layer.
Second Scenario: Database Refactoring
The same development team found useful the tests we have created together. After a few weeks, a database refactor has been made. The DBA team changed the Users table, moving the Username column to a new table. In this case, when the development team download the changes from the source control, the set of tests were executed.
The DBA team executed the following statements:
CREATE TABLE dbo.UserCredentials ( UserId int NOT NULL PRIMARY KEY CLUSTERED , Username varchar(30) NOT NULL , Pwd varchar(30) NOT NULL , MustChangePassword bit NOT NULL DEFAULT(0) ); GO -- [moving data] ALTER TABLE dbo.Users DROP COLUMN Username; GO
Yet another test execution
As we can see, after the data was moved to the new table, a drop column occurred. Developers tried to execute again the tests and they all return error. Error is not a test failure. It is an exception. The set of test was executed running the tSQLt.RunTestClass procedure. This procedure runs all the test in a given test class:
The unit tests in our sample did not work. The “Username” column is missing in every tests. This alert helped the development team to apply the changes to the SearchUsers stored procedure, before going on with the release process. This prevented a big regression bug and the consequent application break.
Unit testing is one of the most important practice to follow when we want to avoid regression in our releases. Evolutionary database design works very well with unit testing, because a stored procedure, written in a testable manner, supports small and iterative changes. Tests saves many hours when making changes since they can suggest the changes to make.
Now, we have tools and it is not mandatory to create a self-made testing architecture. We can use both t-sql or GUI based tools (like SQL Test by Red-Gate) and we can automate test processes in a Continuous Integration environment.