Showing posts with label select. Show all posts
Showing posts with label select. Show all posts

Tuesday, May 7, 2013

Non-SARGable Queries

As I've been doing some research online, I've come across a term that I am not familiar with.  This term is SARGable and is used in reference to the WHERE clause of select statements.  SARG actually stands for Search ARGument able.  Basically, if you have an index or multiple indexes on a table, a SARGable WHERE clause can take advantage of your indexes.  On the other hand, a non-SARGable WHERE clause cannot take full advantage of the indexes and will cause your query to take longer to run.

Keep in mind that this only applies to WHERE clauses.  If these statement are used in your SELECT statement, it will not affect performance.

So, what makes a query non-SARGable?  Here are some examples:
  • Using LIKE with a leading %
    • SELECT...... WHERE LastName like '%mith'
  • Using a field within a function in the WHERE clause
    • SELECT...... WHERE RIGHT(LastName, 5) like 'mith'
Each of these queries skips the index and uses a table scan, causing your query to run longer and resources to be tied up.  So what can we do instead?
  • Use =, >, <, >=, <=, BETWEEN, LIKE without a leading %
    • SELECT....... WHERE LastName like 'Smith%'
  • Use IN, OR, NOT IN, NOT EXISTS, NOT LIKE
    • SELECT....... WHERE LastName not in ('Smith')
Using these clues can increase your performance and should be used at all times when possible.

Wikipedia on "SARGable"

Monday, May 6, 2013

REPLACE Function

Let's say that you have a customer database and you need to update all of your customers' addresses.  You need to change every instance of "Street" to "St."  How would you go about easily doing this using a T-SQL query?

Microsoft SQL Server has a nifty function called REPLACE().  Following is the basic syntax for the REPLACE function:

REPLACE ( string_expression , string_pattern , string_replacement )

In our example above, we would put the field of our address in the "string_expression", the value "Street" in the "string_pattern" and "St." in the "string_replacement:.  You would have a query much like this:

update customer
set address1 = REPLACE(address1, 'Street', 'St.')
where address1 like '%Street%'

You can also use REPLACE() to reference a value in one field to update another field.  For example, instead of using strings in "string_pattern" and "string_replacement", you can use fields in your table.  If the customer table contains both a phone number and an extension for your customer, you could use this to update the customer's call records.  A query would look something like this:

update RatedCall
set CallingNumber = REPLACE(CallingNumber, ANI, Customer.Extension)
from RatedCall, Customer
where RatedCall.CustomerNumber = Customer.CustomerNumber

Where ANI is the customer's phone number and extension is their extension (CallingNumber is the ANI is the call record).  This would only update instances where CallingNumber is the customer's phone number.

In what cases at your organization could you see using the REPLACE() function?

To read up more on SQL REPLACE(), go here: SQL REPLACE()

Thursday, May 2, 2013

SQL Indexes and Their Benefits

As the data in your database grows, you may notice that your querying becomes slower.  This occurs because SQL Server needs to search through each record until it finds all the records you are looking for.  So, when this happens to your database, how do we approach the problem?

The answer to this lies in something called indexes.  An index is described as a structure on a table or view that speeds up the retrieval of data.  In SQL Server, there are two major types of indexes:
  1. Clustered Index
    • Clustered indexes sort and store the data based on key values (specified by you).
    • Each table can only contain one clustered index.
  2. Nonclustered index
    • Nonclustered indexes do not sort and store the data like clustered indexes do.  They basically have a pointer to the data based on the key values selected.
    • The pointer is actually called a row locator.
    • Depending on your version of SQL Server, you can have quite a few nonclustered indexes.
When performing a search on table, SQL Server will use the least labor intensive method.  Indexes will be the least labor intensive (finds storage location for the data) than a non-indexed table (table scan), which needs to search through each row to find a match and can be resource intensive.

Now, it is very important to put in time considering what should be included in your indexes.  For example, clustered indexes should be fields that have unique data (primary key).  Nonclustered indexes could be created on other columns that you search on frequently and multiple columns that are searched together could be included on the same nonclustered index.  Be careful, though, because adding too many indexes (one of every column) will cause too much overhead.

Don't be afraid to try out different indexes on your data.  Put an index in place, query the table, and see if your querying improves.  If not, try a different index and keep trying until you feel like you have a couple that fit your needs.

Thursday, April 25, 2013

Database Normalization

Recently I was venturing through a forum on one of the SQL sites I regularly visit.  While there, I found someone who was asking a question about how to run a particular query on their database.  At first, I thought the query would have been pretty easy to run, but after a few attempts I found that I was still not getting the results I expected.  I took a step back and looked at the data again.  Finally, I realized my queries weren't working out as expected because their database wasn't normalized.

What is database normalization?  Database normalization is taking your data and organizing it into a more efficient manner.  The main goals of database normalization include:

  • Avoiding redundant data
  • Creating a table for each group of related data
  • Putting subsets of data into a separate table
  • Building relationships between each of the tables
Let's use, as an example, that you own a company that sells bike parts.  Would it be more efficient to store all your invoice data into one table?  Would you store the invoice details (charge totals, taxes, customer address) in the same table as the order details (quantity ordered, quantity shipped, items)?  You certainly could, but the data becomes ugly to try to query.

You would be better off having different tables, such as the following:
  • Customer (contains customer information, such as: account number, address, contact phone numbers, email addresses, etc.)
  • Invoice (contains account number, invoice number, charge total, taxes, shipping address, amount paid, etc.)
  • Invoice Details (contains invoice detail number, invoice number, item number, item description, item quantity, etc.)
  • Item Detail (contains item number, serial number, quantity available, etc.)
You will notice there is no duplicated data and we have what is called a Primary Key, or PK.  The PK is a value that uniquely identifies each row in the table.  This primary can then be used in other tables, where it then is called a Foreign Key (FK).  These keys can then be used to select data across each table.

Before I show you how primary keys are used, let me outline which is a PK or FK in each table.  See below:
  • Customer
    • Account Number (PK)
  • Invoice
    • Invoice Number (PK)
    • Account Number (FK)
  • Invoice Details
    • Invoice Detail Number (PK)
    • Invoice Number (FK)
    • Item Number (FK)
  • Item Detail
    • Item Number (PK)
Let's say we want to get the customer information for a particular invoice number.  We would run the following query:

SELECT
from Customer c
join Invoice i
on c.AccountNumber = i.AccountNumber
where i.InvoiceNumber = 12

Since we have the account number in both the Customer and Invoice tables, we can use it to join the tables together.

For more information on Database Normalization, take a look at the following article:

Wednesday, April 24, 2013

Using CASE Statements in Your Select

Let's say that you have a table of data that you want to group together.  Your data might be one of the following:

  • You want to give a category or title to a list of ranges (using int, numeric, float, etc.)
  • You want to group values that aren't necessarily in the same range (varchar, etc)
  • You want to give more description to a single value or list of values
  • You rely on more than one condition to fall under a particular category
There are a few ways that you could approach grabbing this data.  You could write separate queries, just grabbing the data you want grouped together, and then tying all data together using UNION ALL.  That query would look something like this:

SELECT 'Less than $10' as Category, count(*)
from Sales
where SalesAmount < 10

UNION ALL

SELECT 'Between $10 and $50' as Category, count(*)
from Sales
where SalesAmount between 10 and 50

UNION ALL

SELECT 'Over $50' as Category, count(*)
from Sales
where SalesAmount > 50

This certainly gets the job done, but leaves too much room for error.  Your other choice is to write a CASE statement in your select statement (which would need to be accompanied by using the same CASE statement in a group by).  Accomplishing the same output, you could do something like this:

SELECT Category = CASE
WHEN SalesAmount < 10 then 'Less Than $10'
WHEN SalesAmount between 10 and 50 then 'Between $10 and $50'
WHEN SalesAmount > 50 then 'Over $50'
END,
count(*)
from Sales
group by CASE
WHEN SalesAmount < 10 then 'Less Than $10'
WHEN SalesAmount between 10 and 50 then 'Between $10 and $50'
WHEN SalesAmount > 50 then 'Over $50'

END

You can also use the CASE statement to provide more than one condition:


SELECT Category = CASE
WHEN SalesAmount < 9 or SalesAmount > 99 then 'Small and Big Sales'
WHEN SalesAmount between 10 and 50 then 'Middle Sales'
END,
count(*)
from Sales
group by CASE
WHEN SalesAmount < 9 or SalesAmount > 99 then 'Small and Big Sales'
WHEN SalesAmount between 10 and 50 then 'Middle Sales'
END

Go here to read more on SQL CASE statements:

SQL CASE