Saturday, February 19, 2011

SqlServer Built-in Functions

SQL Server provides many built-in functions that you can use in queries to return data or perform operations on data.
FunctionDescription
Rowset Functions Return an object that can be used like table references in an SQL statement.
Aggregate Functions Operate on a collection of values but return a single, summarizing value.
Ranking Functions Return a ranking value for each row in a partition.
Scalar Functions (Described below)Operate on a single value and then return a single value. Scalar functions can be used wherever an expression is valid.
Function categoryDescription
Configuration Functions Return information about the current configuration.
Cryptographic Functions Support encryption, decryption, digital signing, and the validation of digital signatures.
Cursor Functions Return information about cursors.
Data Type Functions Return information about identity values and other data type values.
Date and Time Data Types and Functions Perform operations on a date and time input values and return string, numeric, or date and time values.
Mathematical Functions Perform calculations based on input values provided as parameters to the functions, and return numeric values.
Metadata Functions Return information about the database and database objects.
ODBC Scalar Functions Return information about scalar ODBC functions in a Transact-SQL statement.
Replication Functions Return information that is used to administer, monitor, and maintain a replication topology
Security Functions Return information about users and roles.
String Functions Perform operations on a string (char or varchar) input value and return a string or numeric value.
System Functions Perform operations and return information about values, objects, and settings in an instance of SQL Server.
System Statistical Functions Return statistical information about the system.
Text and Image Functions Perform operations on text or image input values or columns, and return information about the value.
Trigger Functions Return information about triggers.
SQL Server built-in functions are either deterministic or nondeterministic. Functions are deterministic when they always return the same result any time they are called by using a specific set of input values. Functions are nondeterministic when they could return different results every time they are called, even with the same specific set of input values.

Rowset Functions (Transact-SQL)
The following rowset functions return an object that can be used in place of a table reference in a Transact-SQL statement.

 


Returns a table of zero, one, or more rows for those columns containing character-based data types for precise or fuzzy (less precise) matches to single words and phrases, the proximity of words within a certain distance of one another, or weighted matches. CONTAINSTABLE can only be referenced in the FROM clause of a SELECT statement as if it were a regular table name.
Queries using CONTAINSTABLE specify contains-type full-text queries that return a relevance ranking value (RANK) and full-text key (KEY) for each row. The CONTAINSTABLE function uses the same search conditions as the CONTAINS predicate.
Topic link iconTransact-SQL Syntax Conventions
CONTAINSTABLE (table , { column_name | (column_list ) | * } ,' < contains_search_condition > ' 
     [ , LANGUAGE language_term] 
  [ ,top_n_by_rank ] 
          ) 
< contains_search_condition > ::= 
    { < simple_term > 
    | < prefix_term > 
    | < generation_term > 
    | < proximity_term > 
    |  < weighted_term > 
    } 
    | { ( < contains_search_condition > ) 
    { { AND | & } | { AND NOT | &! } | { OR | | } } 
     < contains_search_condition > [ ...n ] 
    }
< simple_term > ::= 
     word | "phrase "
< prefix term > ::= 
     { "word *" | "phrase *" } 
< generation_term > ::= 
     FORMSOF ( { INFLECTIONAL | THESAURUS } , < simple_term > [ ,...n ] ) 
< proximity_term > ::= 
     { < simple_term > | < prefix_term > } 
     { { NEAR | ~ } { < simple_term > | < prefix_term > } } [ ...n ] 
< weighted_term > ::= 
     ISABOUT
    ( { { 
  < simple_term > 
  | < prefix_term > 
  | < generation_term > 
  | < proximity_term > 
  } 
   [ WEIGHT (weight_value ) ] 
   } [ ,...n ] 
    )

table
Is the name of a table that has been full-text indexed. table can be a one-, two-, three-, or four-part database object name. When querying a view, only one full-text indexed base table can be involved. table cannot specify a server name and cannot be used in queries against linked servers.
column_name
Is the name of one or more columns that are indexed for full-text searching. The columns can be of type char, varchar, nchar, nvarchar, text, ntext, image, xml, varbinary, or varbinary(max).
column_list
Indicates that several columns, separated by a comma, can be specified. column_list must be enclosed in parentheses. Unless language_term is specified, the language of all columns of column_list must be the same.
*
Specifies that all full-text indexed columns in table should be used to search for the given search condition. Unless language_term is specified, the language of all columns of the table must be the same.
LANGUAGE language_term
Is the language whose resources will be used for word breaking, stemming, and thesaurus and noise-word (or stopword) removal as part of the query. This parameter is optional and can be specified as a string, integer, or hexadecimal value corresponding to the locale identifier (LCID) of a language. If language_term is specified, the language it represents will be applied to all elements of the search condition. If no value is specified, the column full-text language is used. If documents of different languages are stored together as binary large objects (BLOBs) in a single column, the locale identifier (LCID) of a given document determines what language is used to index its content. When querying such a column, specifying LANGUAGElanguage_term can increase the probability of a good match. When specified as a string, language_term corresponds to the alias column value in the sys.syslanguages compatibility view. The string must be enclosed in single quotation marks, as in 'language_term'. When specified as an integer, language_term is the actual LCID that identifies the language. When specified as a hexadecimal value, language_term is 0x followed by the hexadecimal value of the LCID. The hexadecimal value must not exceed eight digits, including leading zeros. If the value is in double-byte character set (DBCS) format, Microsoft SQL Server will convert it to Unicode. If the language specified is not valid or there are no resources installed that correspond to that language, SQL Server returns an error. To use the neutral language resources, specify 0x0 as language_term.
top_n_by_rank
Specifies that only the nhighest ranked matches, in descending order, are returned. Applies only when an integer value, n, is specified. If top_n_by_rank is combined with other parameters, the query could return fewer rows than the number of rows that actually match all the predicates. top_n_by_rank allows you to increase query performance by recalling only the most relevant hits.
<contains_search_condition>
Specifies the text to search for in column_name and the conditions for a match. For information about search conditions, see CONTAINS (Transact-SQL).
Full-text predicates and functions work on a single table, which is implied in the FROM predicate. To search on multiple tables, use a joined table in your FROM clause to search on a result set that is the product of two or more tables.
The table returned has a column named KEY that contains full-text key values. Each full-text indexed table has a column whose values are guaranteed to be unique, and the values returned in the KEY column are the full-text key values of the rows that match the selection criteria specified in the contains search condition. The TableFulltextKeyColumn property, obtained from the OBJECTPROPERTYEX function, provides the identity of this unique key column. To obtain the ID of the column associated with the full-text key of the full-text index, use sys.fulltext_indexes. For more information, see sys.fulltext_indexes (Transact-SQL).
To obtain the rows you want from the original table, specify a join with the CONTAINSTABLE rows. The typical form of the FROM clause for a SELECT statement using CONTAINSTABLE is:
SELECT select_list
FROM table AS FT_TBL INNER JOIN
   CONTAINSTABLE(table, column, contains_search_condition) AS KEY_TBL
   ON FT_TBL.unique_key_column = KEY_TBL.[KEY]
The table produced by CONTAINSTABLE includes a column named RANK. The RANK column is a value (from 0 through 1000) for each row indicating how well a row matched the selection criteria. This rank value is typically used in one of these ways in the SELECT statement:
  • In the ORDER BY clause to return the highest-ranking rows as the first rows in the table.
  • In the select list to see the rank value assigned to each row.
CONTAINSTABLE is not recognized as a keyword if the compatibility level is less than 70. For more information, see sp_dbcmptlevel (Transact-SQL).

FREETEXTTABLE (Transact-SQL)

 

Returns a table of zero, one, or more rows for those columns containing character-based data types for values that match the meaning, but not the exact wording, of the text in the specified freetext_string. FREETEXTTABLE can only be referenced in the FROM clause of a SELECT statement like a regular table name.
Queries using FREETEXTTABLE specify freetext-type full-text queries that return a relevance ranking value (RANK) and full-text key (KEY) for each row.
Topic link iconTransact-SQL Syntax Conventions
FREETEXTTABLE (table , { column_name | (column_list) | * } 
          ,'freetext_string' 
     [ , LANGUAGE language_term ] 
     [ ,top_n_by_rank ] )

table
Is the name of the table that has been marked for full-text querying. table or viewcan be a one-, two-, or three-part database object name. When querying a view, only one full-text indexed base table can be involved. table cannot specify a server name and cannot be used in queries against linked servers.
column_name
Is the name of one or more full-text indexed columns of the table specified in the FROM clause. The columns can be of type char, varchar, nchar, nvarchar, text, ntext, image, xml, varbinary, or varbinary(max).
column_list
Indicates that several columns, separated by a comma, can be specified. column_list must be enclosed in parentheses. Unless language_term is specified, the language of all columns of column_list must be the same.
*
Specifies that all columns that have been registered for full-text searching should be used to search for the given freetext_string. Unless language_term is specified, the language of all full-text indexed columns in the table must be the same.
freetext_string
Is text to search for in the column_name. Any text, including words, phrases or sentences, can be entered. Matches are generated if any term or the forms of any term is found in the full-text index. Unlike in the CONTAINS search condition where AND is a keyword, when used in freetext_string the word 'and' is considered a noise word, or stopword, and will be discarded. Use of WEIGHT, FORMSOF, wildcards, NEAR and other syntax is not allowed. freetext_string is wordbroken, stemmed, and passed through the thesaurus. If freetext_string is enclosed in double quotation marks, a phrase match is instead performed; stemming and thesaurus are not performed.
LANGUAGE language_term
Is the language whose resources will be used for word breaking, stemming, and thesaurus and stopword removal as part of the query. This parameter is optional and can be specified as a string, integer, or hexadecimal value corresponding to the locale identifier (LCID) of a language. If language_term is specified, the language it represents will be applied to all elements of the search condition. If no value is specified, the column full-text language is used. If documents of different languages are stored together as binary large objects (BLOBs) in a single column, the locale identifier (LCID) of a given document determines what language is used to index its content. When querying such a column, specifying LANGUAGElanguage_term can increase the probability of a good match. When specified as a string, language_term corresponds to the alias column value in the sys.syslanguages (Transact-SQL) compatibility view. The string must be enclosed in single quotation marks, as in 'language_term'. When specified as an integer, language_term is the actual LCID that identifies the language. When specified as a hexadecimal value, language_term is 0x followed by the hexadecimal value of the LCID. The hexadecimal value must not exceed eight digits, including leading zeros. If the value is in double-byte character set (DBCS) format, Microsoft SQL Server will convert it to Unicode. If the language specified is not valid or there are no resources installed that correspond to that language, SQL Server returns an error. To use the neutral language resources, specify 0x0 as language_term.
top_n_by_rank
Specifies that only the nhighest ranked matches, in descending order, are returned. Applies only when an integer value, n, is specified. If top_n_by_rank is combined with other parameters, the query could return fewer rows than the number of rows that actually match all the predicates. top_n_by_rank allows you to increase query performance by recalling only the most relevant hits.
Full-text predicates and functions work on a single table, which is implied in the FROM predicate. To search on multiple tables, use a joined table in your FROM clause to search on a result set that is the product of two or more tables.
FREETEXTTABLE uses the same search conditions as the FREETEXT predicate.
Like CONTAINSTABLE, the table returned has columns named KEY and RANK, which are referenced within the query to obtain the appropriate rows and use the row ranking values.
FREETEXTTABLE is not recognized as a keyword if the compatibility level is less than 70. For more information, see sp_dbcmptlevel (Transact-SQL).

OPENDATASOURCE (Transact-SQL)

 
 

Provides ad hoc connection information as part of a four-part object name without using a linked server name.
Topic link iconTransact-SQL Syntax Conventions
OPENDATASOURCE ( provider_name,init_string )

provider_name
Is the name registered as the PROGID of the OLE DB provider used to access the data source. provider_name is a char data type, with no default value.
init_string
Is the connection string passed to the IDataInitialize interface of the destination provider. The provider string syntax is based on keyword-value pairs separated by semicolons, such as: 'keyword1=value;keyword2=value'. For specific keyword-value pairs supported on the provider, see the Microsoft Data Access SDK. This documentation defines the basic syntax. The following table lists the most frequently used keywords in the init_string argument.
KeywordOLE DB propertyValid values and description
Data SourceDBPROP_INIT_DATASOURCEName of the data source to connect to. Different providers interpret this in different ways. For SQL Server Native Client OLE DB provider, this indicates the name of the server. For Jet OLE DB provider, this indicates the full path of the .mdb file or .xls file.
LocationDBPROP_INIT_LOCATIONLocation of the database to connect to.
Extended PropertiesDBPROP_INIT_PROVIDERSTRINGThe provider-specific connect-string.
Connect timeoutDBPROP_INIT_TIMEOUTTime-out value after which the connection try fails.
User IDDBPROP_AUTH_USERIDUser ID to be used for the connection.
PasswordDBPROP_AUTH_PASSWORDPassword to be used for the connection.
CatalogDBPROP_INIT_CATALOGThe name of the initial or default catalog when connecting to the data source.
Integrated SecurityDBPROP_AUTH_INTEGRATEDSSPI, to specify Windows Authentication
OPENDATASOURCE can be used to access remote data from OLE DB data sources only when the DisallowAdhocAccess registry option is explicitly set to 0 for the specified provider, and the Ad Hoc Distributed Queries advanced configuration option is enabled. When these options are not set, the default behavior does not allow for ad hoc access.
The OPENDATASOURCE function can be used in the same Transact-SQL syntax locations as a linked-server name. Therefore, OPENDATASOURCE can be used as the first part of a four-part name that refers to a table or view name in a SELECT, INSERT, UPDATE, or DELETE statement, or to a remote stored procedure in an EXECUTE statement. When executing remote stored procedures, OPENDATASOURCE should refer to another instance of SQL Server. OPENDATASOURCE does not accept variables for its arguments.
Like the OPENROWSET function, OPENDATASOURCE should only reference OLE DB data sources that are accessed infrequently. Define a linked server for any data sources accessed more than several times. Neither OPENDATASOURCE nor OPENROWSET provide all the functionality of linked-server definitions, such as security management and the ability to query catalog information. All connection information, including passwords, must be provided every time that OPENDATASOURCE is called.
Important note Important
Windows Authentication is much more secure than SQL Server Authentication. You should use Windows Authentication whenever possible. OPENDATASOURCE should not be used with explicit passwords in the connection string.
The connection requirements for each provider are similar to the requirements for those parameters when creating linked servers. The details for many common providers are listed in the topic sp_addlinkedserver (Transact-SQL).
Any user can execute OPENDATASOURCE. The permissions that are used to connect to the remote server are determined from the connection string.
The following example creates an ad hoc connection to the Payroll instance of SQL Server on server London, and queries the AdventureWorks2008R2.HumanResources.Employee table. (Use SQLNCLI and SQL Server will redirect to the latest version of SQL Server Native Client OLE DB Provider.)
SELECT *
FROM OPENDATASOURCE('SQLNCLI',
    'Data Source=London\Payroll;Integrated Security=SSPI')
    .AdventureWorks2008R2.HumanResources.Employee
The following example creates an ad hoc connection to an Excel spreadsheet in the 1997 - 2003 format.
SELECT * FROM OPENDATASOURCE('Microsoft.Jet.OLEDB.4.0',
'Data Source=C:\DataFolder\Documents\TestExcel.xls;Extended Properties=EXCEL 5.0')...[Sheet1$] ;

 


Executes the specified pass-through query on the specified linked server. This server is an OLE DB data source. OPENQUERY can be referenced in the FROM clause of a query as if it were a table name. OPENQUERY can also be referenced as the target table of an INSERT, UPDATE, or DELETE statement. This is subject to the capabilities of the OLE DB provider. Although the query may return multiple result sets, OPENQUERY returns only the first one.
Topic link iconTransact-SQL Syntax Conventions
OPENQUERY (linked_server ,'query')

linked_server
Is an identifier representing the name of the linked server.
' query '
Is the query string executed in the linked server. The maximum length of the string is 8 KB.
OPENQUERY does not accept variables for its arguments.
OPENQUERY cannot be used to execute extended stored procedures on a linked server. However, an extended stored procedure can be executed on a linked server by using a four-part name. For example:
EXEC SeattleSales.master.dbo.xp_msver
Any user can execute OPENQUERY. The permissions that are used to connect to the remote server are obtained from the settings defined for the linked server.

A. Executing a SELECT pass-through query

The following example creates a linked server named OracleSvr against an Oracle database by using the Microsoft OLE DB Provider for Oracle. Then, this example uses a pass-through SELECT query against this linked server.
Note Note
This example assumes that an Oracle database alias called ORCLDB has been created.
EXEC sp_addlinkedserver 'OracleSvr', 
   'Oracle 7.3', 
   'MSDAORA', 
   'ORCLDB'
GO
SELECT *
FROM OPENQUERY(OracleSvr, 'SELECT name, id FROM joe.titles') 
GO

B. Executing an UPDATE pass-through query

The following example uses a pass-through UPDATE query against the linked server created in example A.
UPDATE OPENQUERY (OracleSvr, 'SELECT name FROM joe.titles WHERE id = 101') 
SET name = 'ADifferentName';

C. Executing an INSERT pass-through query

The following example uses a pass-through INSERT query against the linked server created in example A.
INSERT OPENQUERY (OracleSvr, 'SELECT name FROM joe.titles')
VALUES ('NewTitle');

D. Executing a DELETE pass-through query

The following example uses a pass-through DELETE query to delete the row inserted in example C.
DELETE OPENQUERY (OracleSvr, 'SELECT name FROM joe.titles WHERE name = ''NewTitle''');

OPENROWSET (Transact-SQL)

Includes all connection information that is required to access remote data from an OLE DB data source. This method is an alternative to accessing tables in a linked server and is a one-time, ad hoc method of connecting and accessing remote data by using OLE DB. For more frequent references to OLE DB data sources, use linked servers instead. For more information, see Linking Servers. The OPENROWSET function can be referenced in the FROM clause of a query as if it were a table name. The OPENROWSET function can also be referenced as the target table of an INSERT, UPDATE, or DELETE statement, subject to the capabilities of the OLE DB provider. Although the query might return multiple result sets, OPENROWSET returns only the first one.
OPENROWSET also supports bulk operations through a built-in BULK provider that enables data from a file to be read and returned as a rowset. Topic link icon Transact-SQL Syntax Conventions
OPENROWSET 
( { 'provider_name' , { 'datasource' ; 'user_id' ; 'password' 
   | 'provider_string' } 
   , {   [ catalog. ] [ schema. ] object 
       | 'query' 
     } 
   | BULK 'data_file' , 
       { FORMATFILE = 'format_file_path' [ <bulk_options> ]
       | SINGLE_BLOB | SINGLE_CLOB | SINGLE_NCLOB }
} ) 

<bulk_options> ::=
   [ , CODEPAGE = { 'ACP' | 'OEM' | 'RAW' | 'code_page' } ] 
   [ , ERRORFILE = 'file_name' ]
   [ , FIRSTROW = first_row ] 
   [ , LASTROW = last_row ] 
   [ , MAXERRORS = maximum_errors ] 
   [ , ROWS_PER_BATCH = rows_per_batch ]
   [ , ORDER ( { column [ ASC | DESC ] } [ ,...n ] ) [ UNIQUE ] 

'provider_name'
Is a character string that represents the friendly name (or PROGID) of the OLE DB provider as specified in the registry. provider_name has no default value.
'datasource'
Is a string constant that corresponds to a particular OLE DB data source. datasource is the DBPROP_INIT_DATASOURCE property to be passed to the IDBProperties interface of the provider to initialize the provider. Typically, this string includes the name of the database file, the name of a database server, or a name that the provider understands to locate the database or databases.
'user_id'
Is a string constant that is the user name passed to the specified OLE DB provider. user_id specifies the security context for the connection and is passed in as the DBPROP_AUTH_USERID property to initialize the provider. user_id cannot be a Microsoft Windows login name.
'password'
Is a string constant that is the user password to be passed to the OLE DB provider. password is passed in as the DBPROP_AUTH_PASSWORD property when initializing the provider. password cannot be a Microsoft Windows password.
'provider_string'
Is a provider-specific connection string that is passed in as the DBPROP_INIT_PROVIDERSTRING property to initialize the OLE DB provider. provider_string typically encapsulates all the connection information required to initialize the provider. For a list of keywords that are recognized by the SQL Server Native Client OLE DB provider, see Initialization and Authorization Properties.
catalog
Is the name of the catalog or database in which the specified object resides.
schema
Is the name of the schema or object owner for the specified object.
object
Is the object name that uniquely identifies the object to work with.
'query'
Is a string constant sent to and executed by the provider. The local instance of SQL Server does not process this query, but processes query results returned by the provider, a pass-through query. Pass-through queries are useful when used on providers that do not make available their tabular data through table names, but only through a command language. Pass-through queries are supported on the remote server, as long as the query provider supports the OLE DB Command object and its mandatory interfaces. For more information, see SQL Server Native Client (OLE DB) Reference.
BULK
Uses the BULK rowset provider for OPENROWSET to read data from a file. In SQL Server, OPENROWSET can read from a data file without loading the data into a target table. This lets you use OPENROWSET with a simple SELECT statement. The arguments of the BULK option allow for significant control over where to start and end reading data, how to deal with errors, and how data is interpreted. For example, you can specify that the data file be read as a single-row, single-column rowset of type varbinary, varchar, or nvarchar. The default behavior is described in the argument descriptions that follow. For information about how to use the BULK option, see "Remarks," later in this topic. For information about the permissions that are required by the BULK option, see "Permissions," later in this topic.
Note Note
When used to import data with the full recovery model, OPENROWSET (BULK ...) does not optimize logging.
For information on preparing data for bulk import, see Preparing Data for Bulk Export or Import.
'data_file'
Is the full path of the data file whose data is to be copied into the target table.
FORMATFILE ='format_file_path'
Specifies the full path of a format file. SQL Server supports two types of format files: XML and non-XML. A format file is required to define column types in the result set. The only exception is when SINGLE_CLOB, SINGLE_BLOB, or SINGLE_NCLOB is specified; in which case, the format file is not required. For information about format files, see Using a Format File to Bulk Import Data.
< bulk_options >
Specifies one or more arguments for the BULK option.
CODEPAGE = { 'ACP'| 'OEM'| 'RAW'| 'code_page' }
Specifies the code page of the data in the data file. CODEPAGE is relevant only if the data contains char, varchar, or text columns with character values more than 127 or less than 32.
Note Note
We recommend that you specify a collation name for each column in a format file.
CODEPAGE valueDescription
ACPConverts columns of char, varchar, or text data type from the ANSI/Microsoft Windows code page (ISO 1252) to the SQL Server code page.
OEM (default)Converts columns of char, varchar, or text data type from the system OEM code page to the SQL Server code page.
RAWNo conversion occurs from one code page to another. This is the fastest option.
code_page Indicates the source code page on which the character data in the data file is encoded; for example, 850.
Important noteImportant
SQL Server does not support code page 65001 (UTF-8 encoding).
ERRORFILE ='file_name'
Specifies the file used to collect rows that have formatting errors and cannot be converted to an OLE DB rowset. These rows are copied into this error file from the data file "as is." The error file is created at the start of the command execution. An error will be raised if the file already exists. Additionally, a control file that has the extension .ERROR.txt is created. This file references each row in the error file and provides error diagnostics. After the errors have been corrected, the data can be loaded.
FIRSTROW =first_row
Specifies the number of the first row to load. The default is 1. This indicates the first row in the specified data file. The row numbers are determined by counting the row terminators. FIRSTROW is 1-based.
LASTROW =last_row
Specifies the number of the last row to load. The default is 0. This indicates the last row in the specified data file.
MAXERRORS =maximum_errors
Specifies the maximum number of syntax errors or nonconforming rows, as defined in the format file, that can occur before OPENROWSET throws an exception. Until MAXERRORS is reached, OPENROWSET ignores each bad row, not loading it, and counts the bad row as one error. The default for maximum_errors is 10.
Note Note
MAX_ERRORS does not apply to CHECK constraints, or to converting money and bigint data types.
ROWS_PER_BATCH =rows_per_batch
Specifies the approximate number of rows of data in the data file. This value should be of the same order as the actual number of rows. OPENROWSET always imports a data file as a single batch. However, if you specify rows_per_batch with a value > 0, the query processor uses the value of rows_per_batch as a hint for allocating resources in the query plan. By default, ROWS_PER_BATCH is unknown. Specifying ROWS_PER_BATCH = 0 is the same as omitting ROWS_PER_BATCH.
ORDER ( { column [ ASC | DESC ] } [ ,... n ] [ UNIQUE ] )
An optional hint that specifies how the data in the data file is sorted. By default, the bulk operation assumes the data file is unordered. Performance might improve if the order specified can be exploited by the query optimizer to generate a more efficient query plan. Examples for when specifying a sort can be beneficial include the following:
  • Inserting rows into a table that has a clustered index, where the rowset data is sorted on the clustered index key.
  • Joining the rowset with another table, where the sort and join columns match.
  • Aggregating the rowset data by the sort columns.
  • Using the rowset as a source table in the FROM clause of a query, where the sort and join columns match.
UNIQUE specifies that the data file does not have duplicate entries. If the actual rows in the data file are not sorted according to the order that is specified, or if the UNIQUE hint is specified and duplicates keys are present, an error is returned. Column aliases are required when ORDER is used. The column alias list must reference the derived table that is being accessed by the BULK clause. The column names that are specified in the ORDER clause refer to this column alias list. Large value types (varchar(max), nvarchar(max), varbinary(max), and xml) and large object (LOB) types (text, ntext, and image) columns cannot be specified.
SINGLE_BLOB
Returns the contents of data_file as a single-row, single-column rowset of type varbinary(max).
Important note Important
We recommend that you import XML data only using the SINGLE_BLOB option, rather than SINGLE_CLOB and SINGLE_NCLOB, because only SINGLE_BLOB supports all Windows encoding conversions.
SINGLE_CLOB
By reading data_file as ASCII, returns the contents as a single-row, single-column rowset of type varchar(max), using the collation of the current database.
SINGLE_NCLOB
By reading data_file as UNICODE, returns the contents as a single-row, single-column rowset of type nvarchar(max), using the collation of the current database.
OPENROWSET can be used to access remote data from OLE DB data sources only when the DisallowAdhocAccess registry option is explicitly set to 0 for the specified provider, and the Ad Hoc Distributed Queries advanced configuration option is enabled. When these options are not set, the default behavior does not allow for ad hoc access.
When accessing remote OLE DB data sources, the login identity of trusted connections is not automatically delegated from the server on which the client is connected to the server that is being queried. Authentication delegation must be configured. For more information, see Configuring Linked Servers for Delegation.
Catalog and schema names are required if the OLE DB provider supports multiple catalogs and schemas in the specified data source. Values for catalog and schema can be omitted when the OLE DB provider does not support them. If the provider supports only schema names, a two-part name of the form schema.object must be specified. If the provider supports only catalog names, a three-part name of the form catalog.schema.object must be specified. Three-part names must be specified for pass-through queries that use the SQL Server Native Client OLE DB provider. For more information, see Transact-SQL Syntax Conventions (Transact-SQL).
OPENROWSET does not accept variables for its arguments.

Using OPENROWSET with the BULK Option

The following Transact-SQL enhancements support the OPENROWSET(BULK...) function:
  • A FROM clause that is used with SELECT can call OPENROWSET(BULK...) instead of a table name, with full SELECT functionality.
    OPENROWSET with the BULK option requires a correlation name, also known as a range variable or alias, in the FROM clause. Column aliases can be specified. If a column alias list is not specified, the format file must have column names. Specifying column aliases overrides the column names in the format file, such as:
    FROM OPENROWSET(BULK...) AS table_alias
    FROM OPENROWSET(BULK...) AS table_alias(column_alias,...n)
  • A SELECT...FROM OPENROWSET(BULK...) statement queries the data in a file directly, without importing the data into a table. SELECT…FROM OPENROWSET(BULK...) statements can also list bulk-column aliases by using a format file to specify column names, and also data types.
  • Using OPENROWSET(BULK...) as a source table in an INSERT or MERGE statement bulk imports data from a data file into a SQL Server table. For more information, see Importing Bulk Data by Using BULK INSERT or OPENROWSET(BULK...) .
  • When the OPENROWSET BULK option is used with an INSERT statement, the BULK clause supports table hints. In addition to the regular table hints, such as TABLOCK, the BULK clause can accept the following specialized table hints: IGNORE_CONSTRAINTS (ignores only the CHECK and FOREIGN KEY constraints), IGNORE_TRIGGERS, KEEPDEFAULTS, and KEEPIDENTITY. For more information, see Table Hints (Transact-SQL).
For information about how to use INSERT...SELECT * FROM OPENROWSET(BULK...) statements, see Importing and Exporting Bulk Data. For information about when row-insert operations that are performed by bulk import are logged in the transaction log, see Prerequisites for Minimal Logging in Bulk Import.
Note Note
When you use OPENROWSET, it is important to understand how SQL Server handles impersonation. For information about security considerations, see Importing Bulk Data by Using BULK INSERT or OPENROWSET(BULK...).

Bulk Importing SQLCHAR, SQLNCHAR or SQLBINARY Data

OPENROWSET(BULK...) assumes that, if not specified, the maximum length of SQLCHAR, SQLNCHAR or SQLBINARY data does not exceed 8000 bytes. If the data being imported is in a LOB data field that contains any varchar(max), nvarchar(max), or varbinary(max) objects that exceed 8000 bytes, you must use an XML format file that defines the maximum length for the data field. To specify the maximum length, edit the format file and declare the MAX_LENGTH attribute. For more information, see Schema Syntax for XML Format Files.
Note Note
An automatically generated format file does not specify the length or maximum length for a LOB field. However, you can edit a format file and specify the length or maximum length manually.

Bulk Exporting or Importing SQLXML Documents

To bulk export or import SQLXML data, use one of the following data types in your format file.
Data typeEffect
SQLCHAR or SQLVARYCHARThe data is sent in the client code page or in the code page implied by the collation).
SQLNCHAR or SQLNVARCHARThe data is sent as Unicode.
SQLBINARY or SQLVARYBINThe data is sent without any conversion.
OPENROWSET permissions are determined by the permissions of the user name that is being passed to the OLE DB provider. To use the BULK option requires ADMINISTER BULK OPERATIONS permission.

A. Using OPENROWSET with SELECT and the SQL Server Native Client OLE DB Provider

The following example uses the SQL Server Native Client OLE DB provider to access the HumanResources.Department table in the AdventureWorks2008R2 database on the remote server Seattle1. (Use SQLNCLI and SQL Server will redirect to the latest version of SQL Server Native Client OLE DB Provider.) A SELECT statement is used to define the row set returned. The provider string contains the Server and Trusted_Connection keywords. These keywords are recognized by the SQL Server Native Client OLE DB provider.
SELECT a.*
FROM OPENROWSET('SQLNCLI', 'Server=Seattle1;Trusted_Connection=yes;',
     'SELECT GroupName, Name, DepartmentID
      FROM AdventureWorks2008R2.HumanResources.Department
      ORDER BY GroupName, Name') AS a;

B. Using the Microsoft OLE DB Provider for Jet

The following example accesses the Customers table in the Microsoft Access Northwind database through the Microsoft OLE DB Provider for Jet.
Note Note
This example assumes that Access is installed. To run this example, you must install the Northwind database. For information about how to install the Northwind database, see Downloading Northwind and pubs Sample Databases.
SELECT CustomerID, CompanyName
   FROM OPENROWSET('Microsoft.Jet.OLEDB.4.0',
      'C:\Program Files\Microsoft Office\OFFICE11\SAMPLES\Northwind.mdb';
      'admin';'',Customers);
GO

C. Using OPENROWSET and another table in an INNER JOIN

The following example selects all data from the Customers table from the local instance of SQL Server Northwind database and from the Orders table from the Access Northwind database stored on the same computer.
Note Note
This example assumes that Access is installed. To run this example, you must install the Northwind database. For information about how to install the Northwind database, see Downloading Northwind and pubs Sample Databases.
USE Northwind  ;
GO
SELECT c.*, o.*
FROM Northwind.dbo.Customers AS c 
   INNER JOIN OPENROWSET('Microsoft.Jet.OLEDB.4.0', 
   'C:\Program Files\Microsoft Office\OFFICE11\SAMPLES\Northwind.mdb';'admin';'', Orders)    
   AS o 
   ON c.CustomerID = o.CustomerID ;
GO

D. Using OPENROWSET to bulk insert file data into a varbinary(max) column

The following example creates a small table for demonstration purposes, and inserts file data from a file named Text1.txt located in the C: root directory into a varbinary(max) column.
USE AdventureWorks2008R2;
GO
CREATE TABLE myTable(FileName nvarchar(60), 
  FileType nvarchar(60), Document varbinary(max));
GO

INSERT INTO myTable(FileName, FileType, Document) 
   SELECT 'Text1.txt' AS FileName, 
      '.txt' AS FileType, 
      * FROM OPENROWSET(BULK N'C:\Text1.txt', SINGLE_BLOB) AS Document;
GO

E. Using the OPENROWSET BULK provider with a format file to retrieve rows from a text file

The following example uses a format file to retrieve rows from a tab-delimited text file, values.txt that contains the following data:
1     Data Item 1
2     Data Item 2
3     Data Item 3
The format file, values.fmt, describes the columns in values.txt:
9.0
2
1  SQLCHAR  0  10 "\t"        1  ID                SQL_Latin1_General_Cp437_BIN
2  SQLCHAR  0  40 "\r\n"      2  Description        SQL_Latin1_General_Cp437_BIN
This is the query that retrieves that data:
SELECT a.* FROM OPENROWSET( BULK 'c:\test\values.txt', 
   FORMATFILE = 'c:\test\values.fmt') AS a;

OPENXML (Transact-SQL)

 
OPENXML provides a rowset view over an XML document. Because OPENXML is a rowset provider, OPENXML can be used in Transact-SQL statements in which rowset providers such as a table, view, or the OPENROWSET function can appear. Topic link iconTransact-SQL Syntax Conventions
OPENXML( idoc int [ in] ,rowpattern nvarchar [ in ] , [ flags byte [ in ] ] ) 
[ WITH ( SchemaDeclaration | TableName ) ]
idoc
Is the document handle of the internal representation of an XML document. The internal representation of an XML document is created by calling sp_xml_preparedocument.
rowpattern
Is the XPath pattern used to identify the nodes (in the XML document whose handle is passed in the idoc parameter) to be processed as rows.
flags
Indicates the mapping that should be used between the XML data and the relational rowset, and how the spill-over column should be filled. flags is an optional input parameter, and can be one of the following values.
Byte valueDescription
0 Defaults to attribute-centric mapping.
1 Use the attribute-centric mapping. Can be combined with XML_ELEMENTS. In this case, attribute-centric mapping is applied first, and then element-centric mapping is applied for all columns that are not yet dealt with.
2 Use the element-centric mapping. Can be combined with XML_ATTRIBUTES. In this case, attribute-centric mapping is applied first, and then element-centric mapping is applied for all columns not yet dealt with.
8 Can be combined (logical OR) with XML_ATTRIBUTES or XML_ELEMENTS. In the context of retrieval, this flag indicates that the consumed data should not be copied to the overflow property @mp:xmltext.
SchemaDeclaration
Is the schema definition of the form: ColNameColType [ColPattern | MetaProperty] [,ColNameColType [ColPattern | MetaProperty]...]
ColName
Is the column name in the rowset.
ColType
Is the SQL Server data type of the column in the rowset. If the column types differ from the underlying xml data type of the attribute, type coercion occurs.
ColPattern
Is an optional, general XPath pattern that describes how the XML nodes should be mapped to the columns. If ColPattern is not specified, the default mapping (attribute-centric or element-centric mapping as specified by flags) takes place. The XPath pattern specified as ColPattern is used to specify the special nature of the mapping (in the case of attribute-centric and element-centric mapping) that overwrites or enhances the default mapping indicated by flags. The general XPath pattern specified as ColPattern also supports the metaproperties.
MetaProperty
Is one of the metaproperties provided by OPENXML. If MetaProperty is specified, the column contains information provided by the metaproperty. The metaproperties allow you to extract information (such as relative position and namespace information) about XML nodes. This provides more information than is visible in the textual representation.

TableName

Is the table name that can be given (instead of SchemaDeclaration) if a table with the desired schema already exists and no column patterns are required.
The WITH clause provides a rowset format (and additional mapping information as required) by using either SchemaDeclaration or specifying an existing TableName. If the optional WITH clause is not specified, the results are returned in an edge table format. Edge tables represent the fine-grained XML document structure (such as element/attribute names, the document hierarchy, the namespaces, PIs, and son on) in a single table.
The following table describes the structure of the edge table.
Column nameData typeDescription
id bigint Is the unique ID of the document node.
The root element has an ID value 0. The negative ID values are reserved.
parentid bigint Identifies the parent of the node. The parent identified by this ID is not necessarily the parent element, but it depends on the NodeType of the node whose parent is identified by this ID. For example, if the node is a text node, the parent of it may be an attribute node.
If the node is at the top level in the XML document, its ParentID is NULL.
nodetype int Identifies the node type. Is an integer that corresponds to the XML DOM node type numbering.
The node types are:
1 = Element node
2 = Attribute node
3 = Text node
localname nvarchar Gives the local name of the element or attribute. Is NULL if the DOM object does not have a name.
prefix nvarchar Is the namespace prefix of the node name.
namespaceuri nvarchar Is the namespace URI of the node. If the value is NULL, no namespace is present.
datatype nvarchar Is the actual data type of the element or attribute row, otherwise is NULL. The data type is inferred from the inline DTD or from the inline schema.
prev bigint Is the XML ID of the previous sibling element. Is NULL if there is no direct previous sibling.
text ntext Contains the attribute value or the element content in text form (or is NULL if the edge table entry does not require a value).

A. Using a simple SELECT statement with OPENXML

The following example creates an internal representation of the XML image by using sp_xml_preparedocument. A SELECT statement that uses an OPENXML rowset provider is then executed against the internal representation of the XML document.
The flag value is set to 1. This indicates attribute-centric mapping. Therefore, the XML attributes map to the columns in the rowset. The rowpattern specified as /ROOT/Customer identifies the <Customers> nodes to be processed.
The optional ColPattern (column pattern) parameter is not specified because the column name matches the XML attribute names.
The OPENXML rowset provider creates a two-column rowset (CustomerID and ContactName) from which the SELECT statement retrieves the necessary columns (in this case, all the columns).
DECLARE @idoc int
DECLARE @doc varchar(1000)
SET @doc ='
<ROOT>
<Customer CustomerID="VINET" ContactName="Paul Henriot">
   <Order CustomerID="VINET" EmployeeID="5" OrderDate="1996-07-04T00:00:00">
      <OrderDetail OrderID="10248" ProductID="11" Quantity="12"/>
      <OrderDetail OrderID="10248" ProductID="42" Quantity="10"/>
   </Order>
</Customer>
<Customer CustomerID="LILAS" ContactName="Carlos Gonzlez">
   <Order CustomerID="LILAS" EmployeeID="3" OrderDate="1996-08-16T00:00:00">
      <OrderDetail OrderID="10283" ProductID="72" Quantity="3"/>
   </Order>
</Customer>
</ROOT>'
--Create an internal representation of the XML document.
EXEC sp_xml_preparedocument @idoc OUTPUT, @doc
-- Execute a SELECT statement that uses the OPENXML rowset provider.
SELECT    *
FROM       OPENXML (@idoc, '/ROOT/Customer',1)
            WITH (CustomerID  varchar(10),
                  ContactName varchar(20))
Here is the result set.
CustomerID ContactName          
---------- -------------------- 
VINET      Paul Henriot
LILAS      Carlos Gonzlez
If the same SELECT statement is executed with flags set to 2, indicating element-centric mapping, the values of CustomerID and ContactName for both of the customers in the XML document are returned as NULL, because the <Customers> elements do not have any subelements.
Here is the result set.
CustomerID ContactName
---------- -----------
NULL       NULL
NULL       NULL

B. Specifying ColPattern for mapping between columns and the XML attributes

The following query returns customer ID, order date, product ID and quantity attributes from the XML document. The rowpattern identifies the <OrderDetails> elements. ProductID and Quantity are the attributes of the <OrderDetails> element. However, OrderID, CustomerID, and OrderDate are the attributes of the parent element (<Orders>).
The optional ColPattern is specified. This indicates the following:
  • The OrderID, CustomerID, and OrderDate in the rowset map to the attributes of the parent of the nodes identified by rowpattern in the XML document.
  • The ProdID column in the rowset maps to the ProductID attribute, and the Qty column in the rowset maps to the Quantity attribute of the nodes identified in rowpattern.
Although the element-centric mapping is specified by the flags parameter, the mapping specified in ColPattern overwrites this mapping.
DECLARE @idoc int
DECLARE @doc varchar(1000)
SET @doc ='
<ROOT>
<Customer CustomerID="VINET" ContactName="Paul Henriot">
   <Order OrderID="10248" CustomerID="VINET" EmployeeID="5" 
           OrderDate="1996-07-04T00:00:00">
      <OrderDetail ProductID="11" Quantity="12"/>
      <OrderDetail ProductID="42" Quantity="10"/>
   </Order>
</Customer>
<Customer CustomerID="LILAS" ContactName="Carlos Gonzlez">
   <Order OrderID="10283" CustomerID="LILAS" EmployeeID="3" 
           OrderDate="1996-08-16T00:00:00">
      <OrderDetail ProductID="72" Quantity="3"/>
   </Order>
</Customer>
</ROOT>'
--Create an internal representation of the XML document.
EXEC sp_xml_preparedocument @idoc OUTPUT, @doc
-- SELECT stmt using OPENXML rowset provider
SELECT *
FROM   OPENXML (@idoc, '/ROOT/Customer/Order/OrderDetail',2)
         WITH (OrderID       int         '../@OrderID',
               CustomerID  varchar(10) '../@CustomerID',
               OrderDate   datetime    '../@OrderDate',
               ProdID      int         '@ProductID',
               Qty         int         '@Quantity')
Here is the result set.
OrderID CustomerID           OrderDate                 ProdID    Qty
------------------------------------------------------------------------
10248      VINET       1996-07-04 00:00:00.000   11      12
10248      VINET       1996-07-04 00:00:00.000   42      10
10283      LILAS       1996-08-16 00:00:00.000   72      3

C. Obtaining results in an edge table format

The sample XML document in the following example consists of <Customers>, <Orders>, and <Order_0020_Details> elements. First, sp_xml_preparedocument is called to obtain a document handle. This document handle is passed to OPENXML.
In the OPENXML statement, the rowpattern (/ROOT/Customers) identifies the <Customers> nodes to process. Because the WITH clause is not provided, OPENXML returns the rowset in an edge table format.
Finally the SELECT statement retrieves all the columns in the edge table.
DECLARE @idoc int
DECLARE @doc varchar(1000)
SET @doc ='
<ROOT>
<Customers CustomerID="VINET" ContactName="Paul Henriot">
   <Orders CustomerID="VINET" EmployeeID="5" OrderDate=
           "1996-07-04T00:00:00">
      <Order_x0020_Details OrderID="10248" ProductID="11" Quantity="12"/>
      <Order_x0020_Details OrderID="10248" ProductID="42" Quantity="10"/>
   </Orders>
</Customers>
<Customers CustomerID="LILAS" ContactName="Carlos Gonzlez">
   <Orders CustomerID="LILAS" EmployeeID="3" OrderDate=
           "1996-08-16T00:00:00">
      <Order_x0020_Details OrderID="10283" ProductID="72" Quantity="3"/>
   </Orders>
</Customers>
</ROOT>'
--Create an internal representation of the XML document.
EXEC sp_xml_preparedocument @idoc OUTPUT, @doc
-- SELECT statement that uses the OPENXML rowset provider.
SELECT    *
FROM       OPENXML (@idoc, '/ROOT/Customers')
EXEC sp_xml_removedocument @idoc
 

Aggregate Functions (Transact-SQL)

 
Aggregate functions perform a calculation on a set of values and return a single value. Except for COUNT, aggregate functions ignore null values. Aggregate functions are frequently used with the GROUP BY clause of the SELECT statement. All aggregate functions are deterministic. This means aggregate functions return the same value any time that they are called by using a specific set of input values. For more information about function determinism, see Deterministic and Nondeterministic Functions. The OVER clause may follow all aggregate functions except CHECKSUM. Aggregate functions can be used as expressions only in the following:
  • The select list of a SELECT statement (either a subquery or an outer query).
  • A COMPUTE or COMPUTE BY clause.
  • A HAVING clause.
Transact-SQL provides the following aggregate functions:
AVGMIN
CHECKSUM_AGGOVER Clause
COUNTROWCOUNT_BIG
COUNT_BIGSTDEV
GROUPINGSTDEVP
GROUPING_IDSUM
MAXVAR

AVG (Transact-SQL)

 

Returns the average of the values in a group. Null values are ignored. May be followed by the OVER clause.
Topic link iconTransact-SQL Syntax Conventions
AVG ( [ ALL | DISTINCT ] expression )

ALL
Applies the aggregate function to all values. ALL is the default.
DISTINCT
Specifies that AVG be performed only on each unique instance of a value, regardless of how many times the value occurs.
expression
Is an expression of the exact numeric or approximate numeric data type category, except for the bit data type. Aggregate functions and subqueries are not permitted.
The return type is determined by the type of the evaluated result of expression.
Expression resultReturn type
tinyint int
smallint int
int int
bigint bigint
decimal category (p, s)decimal(38, s) divided by decimal(10, 0)
money and smallmoney categorymoney
float and real categoryfloat
If the data type of expression is an alias data type, the return type is also of the alias data type. However, if the base data type of the alias data type is promoted, for example from tinyint to int, the return value is of the promoted data type and not the alias data type.
AVG () computes the average of a set of values by dividing the sum of those values by the count of nonnull values. If the sum exceeds the maximum value for the data type of the return value an error will be returned.

A. Using the SUM and AVG functions for calculations

The following example calculates the average vacation hours and the sum of sick leave hours that the vice presidents of Adventure Works Cycles have used. Each of these aggregate functions produces a single summary value for all the retrieved rows.
USE AdventureWorks2008R2;
GO
SELECT AVG(VacationHours)AS 'Average vacation hours', 
    SUM  (SickLeaveHours) AS 'Total sick leave hours'
FROM HumanResources.Employee
WHERE JobTitle LIKE 'Vice President%';
Here is the result set. Average vacation hours Total sick leave hours ---------------------- ---------------------- 25 97 (1 row(s) affected)

B. Using the SUM and AVG functions with a GROUP BY clause

When used with a GROUP BY clause, each aggregate function produces a single value for each group, instead of for the whole table. The following example produces summary values for each sales territory. The summary lists the average bonus received by the sales people in each territory and the sum of year-to-date sales for each territory.
USE AdventureWorks2008R2;
GO
SELECT TerritoryID, AVG(Bonus)as 'Average bonus', SUM(SalesYTD) as 'YTD sales'
FROM Sales.SalesPerson
GROUP BY TerritoryID;
GO

C. Using AVG with DISTINCT

The following statement returns the average list price of products.
USE AdventureWorks2008R2;
GO
SELECT AVG(DISTINCT ListPrice)
FROM Production.Product;
Here is the result set. ------------------------------ 437.4042 (1 row(s) affected)

D. Using AVG without DISTINCT

Without DISTINCT, the AVG function finds the average list price of all products in the Product table.
USE AdventureWorks2008R2;
GO
SELECT AVG(ListPrice)
FROM Production.Product;
Here is the result set. ------------------------------ 438.6662 (1 row(s) affected)
VARP

CHECKSUM_AGG (Transact-SQL)

 

Returns the checksum of the values in a group. Null values are ignored. Can be followed by the OVER clause.
Topic link iconTransact-SQL Syntax Conventions
CHECKSUM_AGG ( [ ALL | DISTINCT ] expression )

ALL
Applies the aggregate function to all values. ALL is the default.
DISTINCT
Specifies that CHECKSUM_AGG returns the checksum of unique values.
expression
Is an integer expression. Aggregate functions and subqueries are not allowed.
Returns the checksum of all expression values as int.
CHECKSUM_AGG can be used to detect changes in a table.
The order of the rows in the table does not affect the result of CHECKSUM_AGG. Also, CHECKSUM_AGG functions may be used with the DISTINCT keyword and the GROUP BY clause.
If one of the values in the expression list changes, the checksum of the list also generally changes. However, there is a small chance that the checksum will not change.
CHECKSUM_AGG has similar functionality with other aggregate functions. For more information, see Aggregate Functions (Transact-SQL).
The following example uses CHECKSUM_AGG to detect changes in the Quantity column of the ProductInventory table in the AdventureWorks2008R2 database.
--Get the checksum value before the column value is changed.
USE AdventureWorks2008R2;
GO
SELECT CHECKSUM_AGG(CAST(Quantity AS int))
FROM Production.ProductInventory;
GO
Here is the result set.
------------------------
262
UPDATE Production.ProductInventory 
SET Quantity=125
WHERE Quantity=100;
GO
--Get the checksum of the modified column.
SELECT CHECKSUM_AGG(CAST(Quantity AS int))
FROM Production.ProductInventory;
Here is the result set.
------------------------
287

COUNT (Transact-SQL)

 

Returns the number of items in a group. COUNT works like the COUNT_BIG function. The only difference between the two functions is their return values. COUNT always returns an int data type value. COUNT_BIG always returns a bigint data type value. May be followed by the OVER clause.
COUNT ( { [ [ ALL | DISTINCT ] expression ] | * } )

ALL
Applies the aggregate function to all values. ALL is the default.
DISTINCT
Specifies that COUNT returns the number of unique nonnull values.
expression
Is an expression of any type except text, image, or ntext. Aggregate functions and subqueries are not permitted.
*
Specifies that all rows should be counted to return the total number of rows in a table. COUNT(*) takes no parameters and cannot be used with DISTINCT. COUNT(*) does not require an expression parameter because, by definition, it does not use information about any particular column. COUNT(*) returns the number of rows in a specified table without getting rid of duplicates. It counts each row separately. This includes rows that contain null values.
COUNT(*) returns the number of items in a group. This includes NULL values and duplicates.
COUNT(ALL expression) evaluates expression for each row in a group and returns the number of nonnull values.
COUNT(DISTINCT expression) evaluates expression for each row in a group and returns the number of unique, nonnull values.
For return values greater than 2^31-1, COUNT produces an error. Use COUNT_BIG instead.

A. Using COUNT and DISTINCT

The following example lists the number of different titles that an employee who works at Adventure Works Cycles can hold.
USE AdventureWorks2008R2;
GO
SELECT COUNT(DISTINCT JobTitle)
FROM HumanResources.Employee;
GO
Here is the result set.
-----------
67

(1 row(s) affected)

B. Using COUNT(*)

The following example finds the total number of employees who work at Adventure Works Cycles.
USE AdventureWorks2008R2;
GO
SELECT COUNT(*)
FROM HumanResources.Employee;
GO
Here is the result set.
-----------
290

(1 row(s) affected)

C. Using COUNT(*) with other aggregates

The following example shows that COUNT(*) can be combined with other aggregate functions in the select list.
USE AdventureWorks2008R2;
GO
SELECT COUNT(*), AVG(Bonus)
FROM Sales.SalesPerson
WHERE SalesQuota > 25000;
GO
Here is the result set.
----------- ---------------------
14 3472.1428
(1 row(s) affected)

COUNT_BIG (Transact-SQL)

 

Returns the number of items in a group. COUNT_BIG works like the COUNT function. The only difference between the two functions is their return values. COUNT_BIG always returns a bigint data type value. COUNT always returns an int data type value. May be followed by the OVER Clause (Transact-SQL).
Topic link iconTransact-SQL Syntax Conventions
COUNT_BIG ( { [ ALL | DISTINCT ] expression } | * )

ALL
Applies the aggregate function to all values. ALL is the default.
DISTINCT
Specifies that COUNT_BIG returns the number of unique nonnull values.
expression
Is an expression of any type. Aggregate functions and subqueries are not permitted.
*
Specifies that all rows should be counted to return the total number of rows in a table. COUNT_BIG(*) takes no parameters and cannot be used with DISTINCT. COUNT_BIG(*) does not require an expression parameter because, by definition, it does not use information about any particular column. COUNT_BIG(*) returns the number of rows in a specified table without getting rid of duplicates. It counts each row separately. This includes rows that contain null values.
bigint
COUNT_BIG(*) returns the number of items in a group. This includes NULL values and duplicates.
COUNT_BIG(ALL expression) evaluates expression for each row in a group and returns the number of nonnull values.
COUNT_BIG(DISTINCT expression) evaluates expression for each row in a group and returns the number of unique, nonnull values

GROUPING (Transact-SQL)

 

Indicates whether a specified column expression in a GROUP BY list is aggregated or not. GROUPING returns 1 for aggregated or 0 for not aggregated in the result set. GROUPING can be used only in the SELECT <select> list, HAVING, and ORDER BY clauses when GROUP BY is specified.
Topic link iconTransact-SQL Syntax Conventions
GROUPING (<column_expression>)

<column_expression>
Is a column or an expression that contains a column in a GROUP BY clause.
tinyint
GROUPING is used to distinguish the null values that are returned by ROLLUP, CUBE or GROUPING SETS from standard null values. The NULL returned as the result of a ROLLUP, CUBE or GROUPING SETS operation is a special use of NULL. This acts as a column placeholder in the result set and means all.
The following example groups SalesQuota and aggregates SaleYTD amounts. The GROUPING function is applied to the SalesQuota column.
USE AdventureWorks2008R2;
GO
SELECT SalesQuota, SUM(SalesYTD) 'TotalSalesYTD', GROUPING(SalesQuota) AS 'Grouping'
FROM Sales.SalesPerson
GROUP BY SalesQuota WITH ROLLUP;
GO
The result set shows two null values under SalesQuota. The first NULL represents the group of null values from this column in the table. The second NULL is in the summary row added by the ROLLUP operation. The summary row shows the TotalSalesYTD amounts for all SalesQuota groups and is indicated by 1 in the Grouping column.
Here is the result set.
SalesQuota TotalSalesYTD Grouping
--------- ------------- --------
NULL 1533087.5999 0
250000.00 33461260.59 0
300000.00 9299677.9445 0
NULL 44294026.1344 1
(4 row(s) affected)

GROUPING_ID (Transact-SQL)

 

Is a function that computes the level of grouping. GROUPING_ID can be used only in the SELECT <select> list, HAVING, or ORDER BY clauses when GROUP BY is specified.
Topic link iconTransact-SQL Syntax Conventions
GROUPING_ID ( <column_expression>[ ,...n ] )

<column_expression>
Is a column_expression in a GROUP BY clause.
The GROUPING_ID <column_expression> must exactly match the expression in the GROUP BY list. For example, if you are grouping by DATEPART (yyyy, <column name>), use GROUPING_ID (DATEPART (yyyy, <column name>)); or if you are grouping by <column name>, use GROUPING_ID (<column name>).

Comparing GROUPING_ID () to GROUPING ()

GROUPING_ID (<column_expression> [ ,...n ]) inputs the equivalent of the GROUPING (<column_expression>) return for each column in its column list in each output row as a string of ones and zeros. GROUPING_ID interprets that string as a base-2 number and returns the equivalent integer. For example consider the following statement: SELECT a, b, c, SUM(d),GROUPING_ID(a,b,c)FROM T GROUP BY <group by list>. The following table shows the GROUPING_ID () input and output values.
Columns aggregatedGROUPING_ID (a, b, c) input = GROUPING(a) + GROUPING(b) + GROUPING(c)GROUPING_ID () output
a 100 4
b 010 2
c 001 1
ab 110 6
ac 101 5
bc 011 3
abc 111 7

Technical Definition of GROUPING_ID ()

Each GROUPING_ID argument must be an element of the GROUP BY list. GROUPING_ID () returns an integer bitmap whose lowest N bits may be lit. A lit bit indicates the corresponding argument is not a grouping column for the given output row. The lowest-order bit corresponds to argument N, and the N-1th lowest-order bit corresponds to argument 1.

GROUPING_ID () Equivalents

For a single grouping query, GROUPING (<column_expression>) is equivalent to GROUPING_ID (<column_expression>), and both return 0.
For example, the following statements are equivalent:
SELECT GROUPING_ID(A,B)
FROM T 
GROUP BY CUBE(A,B) 
SELECT 3 FROM T GROUP BY ()
UNION ALL
SELECT 1 FROM T GROUP BY A
UNION ALL
SELECT 2 FROM T GROUP BY B
UNION ALL
SELECT 0 FROM T GROUP BY A,B

A. Using GROUPING_ID to identify grouping levels

The following example returns the count of employees by Name and Title, Name, and company total. GROUPING_ID() is used to create a value for each row in the Title column that identifies its level of aggregation.
USE AdventureWorks2008R2;
GO
SELECT D.Name
    ,CASE 
    WHEN GROUPING_ID(D.Name, E.JobTitle) = 0 THEN E.JobTitle
    WHEN GROUPING_ID(D.Name, E.JobTitle) = 1 THEN N'Total: ' + D.Name 
    WHEN GROUPING_ID(D.Name, E.JobTitle) = 3 THEN N'Company Total:'
        ELSE N'Unknown'
    END AS N'Job Title'
    ,COUNT(E.BusinessEntityID) AS N'Employee Count'
FROM HumanResources.Employee E
    INNER JOIN HumanResources.EmployeeDepartmentHistory DH
        ON E.BusinessEntityID = DH.BusinessEntityID
    INNER JOIN HumanResources.Department D
        ON D.DepartmentID = DH.DepartmentID     
WHERE DH.EndDate IS NULL
    AND D.DepartmentID IN (12,14)
GROUP BY ROLLUP(D.Name, E.JobTitle);

B. Using GROUPING_ID to filter a result set

Simple Example

In the following code, to return only the rows that have a count of employees by title, remove the comment characters from HAVING GROUPING_ID(D.Name, E.JobTitle); = 0. To return only rows with a count of employees by department, remove the comment characters from HAVING GROUPING_ID(D.Name, E.JobTitle) = 1;.
USE AdventureWorks2008R2;
GO
SELECT D.Name
    ,E.JobTitle
    ,GROUPING_ID(D.Name, E.JobTitle) AS 'Grouping Level'
    ,COUNT(E.BusinessEntityID) AS N'Employee Count'
FROM HumanResources.Employee AS E
    INNER JOIN HumanResources.EmployeeDepartmentHistory AS DH
        ON E.BusinessEntityID = DH.BusinessEntityID
    INNER JOIN HumanResources.Department AS D
        ON D.DepartmentID = DH.DepartmentID     
WHERE DH.EndDate IS NULL
    AND D.DepartmentID IN (12,14)
GROUP BY ROLLUP(D.Name, E.JobTitle)
--HAVING GROUPING_ID(D.Name, E.JobTitle) = 0; --All titles
--HAVING GROUPING_ID(D.Name, E.JobTitle) = 1; --Group by Name;
Here is the unfiltered result set.
NameTitleGrouping LevelEmployee CountName
Document ControlControl Specialist02Document Control
Document ControlDocument Control Assistant02Document Control
Document ControlDocument Control Manager01Document Control
Document ControlNULL15Document Control
Facilities and MaintenanceFacilities Administrative Assistant01Facilities and Maintenance
Facilities and MaintenanceFacilities Manager01Facilities and Maintenance
Facilities and MaintenanceJanitor04Facilities and Maintenance
Facilities and MaintenanceMaintenance Supervisor01Facilities and Maintenance
Facilities and MaintenanceNULL17Facilities and Maintenance
NULLNULL312NULL

Complex Example

In the following example, GROUPING_ID() is used to filter a result set that contains multiple grouping levels by grouping level. Similar code can be used to create a view that has several grouping levels and a stored procedure that calls the view by passing a parameter that filters the view by grouping level.
USE AdventureWorks2008R2;
GO
DECLARE @Grouping nvarchar(50);
DECLARE @GroupingLevel smallint;
SET @Grouping = N'CountryRegionCode Total';

SELECT @GroupingLevel = (
    CASE @Grouping
        WHEN N'Grand Total'             THEN 15
        WHEN N'SalesPerson Total'       THEN 14
        WHEN N'Store Total'             THEN 13
        WHEN N'Store SalesPerson Total' THEN 12
        WHEN N'CountryRegionCode Total' THEN 11
        WHEN N'Group Total'             THEN 7
        ELSE N'Unknown'
    END);

SELECT 
    T.[Group]
    ,T.CountryRegionCode
    ,S.Name AS N'Store'
    ,(SELECT P.FirstName + ' ' + P.LastName 
        FROM Person.Person AS P 
        WHERE P.BusinessEntityID = H.SalesPersonID)
        AS N'Sales Person'
    ,SUM(TotalDue)AS N'TotalSold'
    ,CAST(GROUPING(T.[Group])AS char(1)) + 
        CAST(GROUPING(T.CountryRegionCode)AS char(1)) + 
        CAST(GROUPING(S.Name)AS char(1)) + 
        CAST(GROUPING(H.SalesPersonID)AS char(1)) 
        AS N'GROUPING base-2'
    ,GROUPING_ID((T.[Group])
        ,(T.CountryRegionCode),(S.Name),(H.SalesPersonID)
        ) AS N'GROUPING_ID'
    ,CASE 
        WHEN GROUPING_ID(
            (T.[Group]),(T.CountryRegionCode)
            ,(S.Name),(H.SalesPersonID)
            ) = 15 THEN N'Grand Total'
        WHEN GROUPING_ID(
            (T.[Group]),(T.CountryRegionCode)
            ,(S.Name),(H.SalesPersonID)
            ) = 14 THEN N'SalesPerson Total'
        WHEN GROUPING_ID(
            (T.[Group]),(T.CountryRegionCode)
            ,(S.Name),(H.SalesPersonID)
            ) = 13 THEN N'Store Total'
        WHEN GROUPING_ID(
            (T.[Group]),(T.CountryRegionCode)
            ,(S.Name),(H.SalesPersonID)
            ) = 12 THEN N'Store SalesPerson Total'
        WHEN GROUPING_ID(
            (T.[Group]),(T.CountryRegionCode)
            ,(S.Name),(H.SalesPersonID)
            ) = 11 THEN N'CountryRegionCode Total'
        WHEN GROUPING_ID(
            (T.[Group]),(T.CountryRegionCode)
            ,(S.Name),(H.SalesPersonID)
            ) =  7 THEN N'Group Total'
        ELSE N'Error'
        END AS N'Level'
FROM Sales.Customer AS C
    INNER JOIN Sales.Store AS S
        ON C.StoreID  = S.BusinessEntityID 
    INNER JOIN Sales.SalesTerritory AS T
        ON C.TerritoryID  = T.TerritoryID 
    INNER JOIN Sales.SalesOrderHeader AS H
        ON C.CustomerID = H.CustomerID
GROUP BY GROUPING SETS ((S.Name,H.SalesPersonID)
    ,(H.SalesPersonID),(S.Name)
    ,(T.[Group]),(T.CountryRegionCode),()
    )
HAVING GROUPING_ID(
    (T.[Group]),(T.CountryRegionCode),(S.Name),(H.SalesPersonID)
    ) = @GroupingLevel
ORDER BY 
    GROUPING_ID(S.Name,H.SalesPersonID),GROUPING_ID((T.[Group])
    ,(T.CountryRegionCode)
    ,(S.Name)
    ,(H.SalesPersonID))ASC;

C. Using GROUPING_ID () with ROLLUP and CUBE to identify grouping levels

The code in the following examples show using GROUPING() to compute the Bit Vector(base-2) column. GROUPING_ID() is used to compute the corresponding Integer Equivalent column. The column order in the GROUPING_ID() function is the opposite of the column order of the columns that are concatenated by the GROUPING() function.
In these examples, GROUPING_ID() is used to create a value for each row in the Grouping Level column to identify the level of grouping. Grouping levels are not always a consecutive list of integers that start with 1 (0, 1, 2,...n).
Note Note
GROUPING and GROUPING_ID can be used n a HAVING clause to filter a result set.

ROLLUP Example

In this example, all grouping levels do not appear as they do in the following CUBE example. If the order of the columns in the ROLLUP list is changed, the level values in the Grouping Level column will also have to be changed.
USE AdventureWorks2008R2;
GO
SELECT DATEPART(yyyy,OrderDate) AS N'Year'
    ,DATEPART(mm,OrderDate) AS N'Month'
    ,DATEPART(dd,OrderDate) AS N'Day'
    ,SUM(TotalDue) AS N'Total Due'
    ,CAST(GROUPING(DATEPART(dd,OrderDate))AS char(1)) + 
        CAST(GROUPING(DATEPART(mm,OrderDate))AS char(1)) + 
        CAST(GROUPING(DATEPART(yyyy,OrderDate))AS char(1)) 
     AS N'Bit Vector(base-2)'
    ,GROUPING_ID(DATEPART(yyyy,OrderDate)
        ,DATEPART(mm,OrderDate)
        ,DATEPART(dd,OrderDate)) 
        AS N'Integer Equivalent'
    ,CASE
        WHEN GROUPING_ID(DATEPART(yyyy,OrderDate)
            ,DATEPART(mm,OrderDate),DATEPART(dd,OrderDate)
            ) = 0 THEN N'Year Month Day'
        WHEN GROUPING_ID(DATEPART(yyyy,OrderDate)
            ,DATEPART(mm,OrderDate),DATEPART(dd,OrderDate)
            ) = 1 THEN N'Year Month'
        WHEN GROUPING_ID(DATEPART(yyyy,OrderDate)
            ,DATEPART(mm,OrderDate),DATEPART(dd,OrderDate)
            ) = 2 THEN N'not used'
        WHEN GROUPING_ID(DATEPART(yyyy,OrderDate)
            ,DATEPART(mm,OrderDate),DATEPART(dd,OrderDate)
            ) = 3 THEN N'Year'
        WHEN GROUPING_ID(DATEPART(yyyy,OrderDate)
            ,DATEPART(mm,OrderDate),DATEPART(dd,OrderDate)
            ) = 4 THEN N'not used'
        WHEN GROUPING_ID(DATEPART(yyyy,OrderDate)
            ,DATEPART(mm,OrderDate),DATEPART(dd,OrderDate)
            ) = 5 THEN N'not used'
        WHEN GROUPING_ID(DATEPART(yyyy,OrderDate)
            ,DATEPART(mm,OrderDate),DATEPART(dd,OrderDate)
            ) = 6 THEN N'not used'
        WHEN GROUPING_ID(DATEPART(yyyy,OrderDate)
            ,DATEPART(mm,OrderDate),DATEPART(dd,OrderDate)
            ) = 7 THEN N'Grand Total'
    ELSE N'Error'
    END AS N'Grouping Level'
FROM Sales.SalesOrderHeader
WHERE DATEPART(yyyy,OrderDate) IN(N'2007',N'2008')
    AND DATEPART(mm,OrderDate) IN(1,2)
    AND DATEPART(dd,OrderDate) IN(1,2)
GROUP BY ROLLUP(DATEPART(yyyy,OrderDate)
        ,DATEPART(mm,OrderDate)
        ,DATEPART(dd,OrderDate))
ORDER BY GROUPING_ID(DATEPART(mm,OrderDate)
    ,DATEPART(yyyy,OrderDate)
    ,DATEPART(dd,OrderDate)
    )
    ,DATEPART(yyyy,OrderDate)
    ,DATEPART(mm,OrderDate)
    ,DATEPART(dd,OrderDate);
Here is a partial result set.
YearMonthDayTotal DueBit Vector (base-2)Integer EquivalentGrouping Level
2007111497452.60660000Year Month Day
20071221772.34940000Year Month Day
2007212705653.59130000Year Month Day
20072221684.40680000Year Month Day
2008111908122.09670000Year Month Day
20081246458.06910000Year Month Day
2008213108771.97290000Year Month Day
20082254598.54880000Year Month Day
20071NULL1519224.9561001Year Month
20072NULL2727337.99811001Year Month
20081NULL1954580.16581001Year Month
20082NULL3163370.52171001Year Month
2007NULLNULL4246562.95411103Year
2008NULLNULL5117950.68751103Year
NULLNULLNULL9364513.64161117Grand Total

CUBE Example

In this example, the GROUPING_ID() function is used to create a value for each row in the Grouping Level column to identify the level of grouping.
Unlike ROLLUP in the previous example, CUBE outputs all grouping levels. If the order of the columns in the CUBE list is changed, the level values in the Grouping Level column will also have to be changed.
USE AdventureWorks2008R2;
GO
SELECT DATEPART(yyyy,OrderDate) AS N'Year'
    ,DATEPART(mm,OrderDate) AS N'Month'
    ,DATEPART(dd,OrderDate) AS N'Day'
    ,SUM(TotalDue) AS N'Total Due'
    ,CAST(GROUPING(DATEPART(dd,OrderDate))AS char(1)) + 
        CAST(GROUPING(DATEPART(mm,OrderDate))AS char(1)) + 
        CAST(GROUPING(DATEPART(yyyy,OrderDate))AS char(1)) 
        AS N'Bit Vector(base-2)'
    ,GROUPING_ID(DATEPART(yyyy,OrderDate)
        ,DATEPART(mm,OrderDate)
        ,DATEPART(dd,OrderDate)) 
        AS N'Integer Equivalent'
    ,CASE
        WHEN GROUPING_ID(DATEPART(yyyy,OrderDate)
            ,DATEPART(mm,OrderDate),DATEPART(dd,OrderDate)
            ) = 0 THEN N'Year Month Day'
        WHEN GROUPING_ID(DATEPART(yyyy,OrderDate) 
            ,DATEPART(mm,OrderDate),DATEPART(dd,OrderDate)
            ) = 1 THEN N'Year Month'
        WHEN GROUPING_ID(DATEPART(yyyy,OrderDate) 
            ,DATEPART(mm,OrderDate),DATEPART(dd,OrderDate)
            ) = 2 THEN N'Year Day'
        WHEN GROUPING_ID(DATEPART(yyyy,OrderDate) 
            ,DATEPART(mm,OrderDate),DATEPART(dd,OrderDate)
            ) = 3 THEN N'Year'
        WHEN GROUPING_ID(DATEPART(yyyy,OrderDate) 
            ,DATEPART(mm,OrderDate),DATEPART(dd,OrderDate)
            ) = 4 THEN N'Month Day'
        WHEN GROUPING_ID(DATEPART(yyyy,OrderDate) 
            ,DATEPART(mm,OrderDate),DATEPART(dd,OrderDate)
            ) = 5 THEN N'Month'
        WHEN GROUPING_ID(DATEPART(yyyy,OrderDate) 
            ,DATEPART(mm,OrderDate),DATEPART(dd,OrderDate)
            ) = 6 THEN N'Day'
        WHEN GROUPING_ID(DATEPART(yyyy,OrderDate) 
            ,DATEPART(mm,OrderDate),DATEPART(dd,OrderDate)
            ) = 7 THEN N'Grand Total'
    ELSE N'Error'
    END AS N'Grouping Level'
FROM Sales.SalesOrderHeader
WHERE DATEPART(yyyy,OrderDate) IN(N'2007',N'2008')
    AND DATEPART(mm,OrderDate) IN(1,2)
    AND DATEPART(dd,OrderDate) IN(1,2)
GROUP BY CUBE(DATEPART(yyyy,OrderDate)
    ,DATEPART(mm,OrderDate)
    ,DATEPART(dd,OrderDate))
ORDER BY GROUPING_ID(DATEPART(yyyy,OrderDate)
    ,DATEPART(mm,OrderDate)
    ,DATEPART(dd,OrderDate)
    )
    ,DATEPART(yyyy,OrderDate)
    ,DATEPART(mm,OrderDate)
    ,DATEPART(dd,OrderDate);
Here is a partial result set.
YearMonthDayTotal DueBit Vector (base-2)Integer EquivalentGrouping Level
2007111497452.60660000Year Month Day
20071221772.34940000Year Month Day
2007212705653.59130000Year Month Day
20072221684.40680000Year Month Day
2008111908122.09670000Year Month Day
20081246458.06910000Year Month Day
2008213108771.97290000Year Month Day
20082254598.54880000Year Month Day
20071NULL1519224.9561001Year Month
20072NULL2727337.99811001Year Month
20081NULL1954580.16581001Year Month
20082NULL3163370.52171001Year Month
2007NULL14203106.19790102Year Day
2007NULL243456.75620102Year Day
2008NULL15016894.06960102Year Day
2008NULL2101056.61790102Year Day
2007NULLNULL4246562.95411103Year
2008NULLNULL5117950.68751103Year
NULL113405574.70330014Month Day
NULL1268230.41850014Month Day
NULL215814425.56420014Month Day
NULL2276282.95560014Month Day
NULL1NULL3473805.12181015Month
NULL2NULL5890708.51981015Month
NULLNULL19220000.26750116Day
NULLNULL2144513.37410116Day
NULLNULLNULL9364513.64161117Grand Total

MAX (Transact-SQL)


Returns the maximum value in the expression. May be followed by the OVER clause.
MAX ( [ ALL | DISTINCT ] expression )

ALL
Applies the aggregate function to all values. ALL is the default.
DISTINCT
Specifies that each unique value is considered. DISTINCT is not meaningful with MAX and is available for ISO compatibility only.
expression
Is a constant, column name, or function, and any combination of arithmetic, bitwise, and string operators. MAX can be used with numeric, character, and datetime columns, but not with bit columns. Aggregate functions and subqueries are not permitted. For more information, see Expressions (Transact-SQL).
Returns a value same as expression.
MAX ignores any null values.
For character columns, MAX finds the highest value in the collating sequence.
The following example returns the highest (maximum) tax rate.
USE AdventureWorks2008R2;
GO
SELECT MAX(TaxRate)
FROM Sales.SalesTaxRate;
GO
Here is the result set.
------
19.60
(1 row(s) affected)
Warning, null value eliminated from aggregate.

MIN (Transact-SQL)

 

Returns the minimum value in the expression. May be followed by the OVER clause.
Topic link iconTransact-SQL Syntax Conventions
MIN ( [ ALL | DISTINCT ] expression )

ALL
Applies the aggregate function to all values. ALL is the default.
DISTINCT
Specifies that each unique value is considered. DISTINCT is not meaningful with MIN and is available for ISO compatibility only.
expression
Is a constant, column name, or function, and any combination of arithmetic, bitwise, and string operators. MIN can be used with numeric, char, varchar, or datetime columns, but not with bit columns. Aggregate functions and subqueries are not permitted. For more information, see Expressions (Transact-SQL).
Returns a value same as expression.
MIN ignores any null values.
With character data columns, MIN finds the value that is lowest in the sort sequence.
The following example returns the lowest (minimum) tax rate.
USE AdventureWorks2008R2;
GO
SELECT MIN(TaxRate)
FROM Sales.SalesTaxRate;
GO
Here is the result set.
-------------------
5.00
(1 row(s) affected)

OVER Clause (Transact-SQL)


Determines the partitioning and ordering of the rowset before the associated window function is applied.
Applies to:
Ranking window functions
Aggregate window functions. For more information, see Aggregate Functions (Transact-SQL).
Topic link iconTransact-SQL Syntax Conventions
Ranking Window Functions 
< OVER_CLAUSE > :: =
    OVER ( [ PARTITION BY value_expression, ... [ n ] ]
           <ORDER BY_Clause> )Aggregate Window Functions 
< OVER_CLAUSE > :: = 
    OVER ( [ PARTITION BY value_expression, ... [ n ] ] )

PARTITION BY
Divides the result set into partitions. The window function is applied to each partition separately and computation restarts for each partition.
value_expression
Specifies the column by which the rowset produced by the corresponding FROM clause is partitioned. value_expression can only refer to columns made available by the FROM clause. value_expression cannot refer to expressions or aliases in the select list. value_expression can be a column expression, scalar subquery, scalar function, or user-defined variable.
<ORDER BY Clause>
Specifies the order to apply the ranking window function. For more information, see ORDER BY Clause (Transact-SQL).
Important note Important
When used in the context of a ranking window function, <ORDER BY Clause> can only refer to columns made available by the FROM clause. An integer cannot be specified to represent the position of the name or alias of a column in the select list. <ORDER BY Clause> cannot be used with aggregate window functions.
Window functions are defined in the ISO SQL standard. SQL Server provides ranking and aggregate window functions. A window is a user-specified set of rows. A window function computes a value for each row in a result set derived from the window.
More than one ranking or aggregate window function can be used in a single query with a single FROM clause. However, the OVER clause for each function can differ in partitioning and also ordering. The OVER clause cannot be used with the CHECKSUM aggregate function.

A. Using the OVER clause with the ROW_NUMBER function

Each ranking function, ROW_NUMBER, DENSE_RANK, RANK, NTILE uses the OVER clause. The following example shows using the OVER clause with ROW_NUMBER.
USE AdventureWorks2008R2;
GO
SELECT p.FirstName, p.LastName
    ,ROW_NUMBER() OVER(ORDER BY SalesYTD DESC) AS 'Row Number'
    ,s.SalesYTD, a.PostalCode
FROM Sales.SalesPerson s 
    INNER JOIN Person.Person p 
        ON s.BusinessEntityID = p.BusinessEntityID
    INNER JOIN Person.Address a 
        ON a.AddressID = p.BusinessEntityID
WHERE TerritoryID IS NOT NULL 
    AND SalesYTD <> 0;
GO

B. Using the OVER clause with aggregate functions

The following examples show using the OVER clause with aggregate functions. In this example, using the OVER clause is more efficient than using subqueries.
USE AdventureWorks2008R2;
GO
SELECT SalesOrderID, ProductID, OrderQty
    ,SUM(OrderQty) OVER(PARTITION BY SalesOrderID) AS 'Total'
    ,AVG(OrderQty) OVER(PARTITION BY SalesOrderID) AS 'Avg'
    ,COUNT(OrderQty) OVER(PARTITION BY SalesOrderID) AS 'Count'
    ,MIN(OrderQty) OVER(PARTITION BY SalesOrderID) AS 'Min'
    ,MAX(OrderQty) OVER(PARTITION BY SalesOrderID) AS 'Max'
FROM Sales.SalesOrderDetail 
WHERE SalesOrderID IN(43659,43664);
GO
Here is the result set.
SalesOrderIDProductIDOrderQtyTotalAvgCountMinMax
4365977612621216
4365977732621216
4365977812621216
4365977112621216
4365977212621216
4365977322621216
4365977412621216
4365971432621216
4365971612621216
4365970962621216
4365971222621216
4365971142621216
436647721141814
436647754141814
436647141141814
436647161141814
436647772141814
436647713141814
436647731141814
436647781141814
The following example shows using the OVER clause with an aggregate function in a calculated value.
USE AdventureWorks2008R2;
GO
SELECT SalesOrderID, ProductID, OrderQty
    ,SUM(OrderQty) OVER(PARTITION BY SalesOrderID) AS 'Total'
    ,CAST(1. * OrderQty / SUM(OrderQty) OVER(PARTITION BY SalesOrderID) 
        *100 AS DECIMAL(5,2))AS 'Percent by ProductID'
FROM Sales.SalesOrderDetail 
WHERE SalesOrderID IN(43659,43664);
GO
Here is the result set. Notice that the aggregates are calculated by SalesOrderID and the Percent by ProductID is calculated for each line of each SalesOrderID.
SalesOrderIDProductIDOrderQtyTotalPercent by ProductID
436597761263.85
4365977732611.54
436597781263.85
436597711263.85
436597721263.85
436597732267.69
436597741263.85
4365971432611.54
436597161263.85
4365970962623.08
436597122267.69
4365971142615.38
436647721147.14
4366477541428.57
436647141147.14
436647161147.14
4366477721414.29
4366477131421.43
436647731147.14
436647781147.14

ROWCOUNT_BIG (Transact-SQL)


Returns the number of rows affected by the last statement executed. This function operates like @@ROWCOUNT, except the return type of ROWCOUNT_BIG is bigint.
ROWCOUNT_BIG ()
bigint
Following a SELECT statement, this function returns the number of rows returned by the SELECT statement.
Following an INSERT, UPDATE, or DELETE statement, this function returns the number of rows affected by the data modification statement.
Following statements that do not return rows, such as an IF statement, this function returns 0

STDEV (Transact-SQL)

 

Returns the statistical standard deviation of all values in the specified expression. May be followed by the OVER clause.
Topic link iconTransact-SQL Syntax Conventions
STDEV ( [ ALL | DISTINCT ] expression )

ALL
Applies the function to all values. ALL is the default.
DISTINCT
Specifies that each unique value is considered.
expression
Is a numeric expression. Aggregate functions and subqueries are not permitted. expression is an expression of the exact numeric or approximate numeric data type category, except for the bit data type.
If STDEV is used on all items in a SELECT statement, each value in the result set is included in the calculation. STDEV can be used with numeric columns only. Null values are ignored.
The following example returns the standard deviation for all bonus values in the SalesPerson table.
USE AdventureWorks2008R2;
GO
SELECT STDEV(Bonus)
FROM Sales.SalesPerson;
GO

STDEVP (Transact-SQL)

Returns the statistical standard deviation for the population for all values in the specified expression. May be followed by the OVER clause. Topic link iconTransact-SQL Syntax Conventions
STDEVP ( [ ALL | DISTINCT ] expression )
ALL
Applies the function to all values. ALL is the default.
DISTINCT
Specifies that each unique value is considered.
expression
Is a numeric expression. Aggregate functions and subqueries are not permitted. expression is an expression of the exact numeric or approximate numeric data type category, except for the bit data type.
If STDEVP is used on all items in a SELECT statement, each value in the result set is included in the calculation. STDEVP can be used with numeric columns only. Null values are ignored.
The following example returns the standard deviation for the population for all bonus values in the SalesPerson table.
USE AdventureWorks2008R2;
GO
SELECT STDEVP(Bonus)
FROM Sales.SalesPerson;
GO

SUM (Transact-SQL)

Returns the sum of all the values, or only the DISTINCT values, in the expression. SUM can be used with numeric columns only. Null values are ignored. May be followed by the OVER Clause (Transact-SQL).
SUM ( [ ALL | DISTINCT ] expression )
ALL
Applies the aggregate function to all values. ALL is the default.
DISTINCT
Specifies that SUM return the sum of unique values.
expression
Is a constant, column, or function, and any combination of arithmetic, bitwise, and string operators. expression is an expression of the exact numeric or approximate numeric data type category, except for the bit data type. Aggregate functions and subqueries are not permitted. For more information, see Expressions (Transact-SQL).
Returns the summation of all expression values in the most precise expression data type.
Expression resultReturn type
tinyint int
smallint int
int int
bigint bigint
decimal category (p, s)decimal(38, s)
money and smallmoney categorymoney
float and real categoryfloat

A. Using SUM for aggregates and row aggregates

The following examples show the differences between aggregate functions and row aggregate functions. The first shows aggregate functions giving only the summary data, and the second shows row aggregate functions giving both the detail and summary data.
USE AdventureWorks2008R2;
GO
SELECT Color, SUM(ListPrice), SUM(StandardCost)
FROM Production.Product
WHERE Color IS NOT NULL 
    AND ListPrice != 0.00 
    AND Name LIKE 'Mountain%'
GROUP BY Color
ORDER BY Color;
GO
Here is the result set.
Color
--------------- --------------------- ---------------------
Black 27404.84 15214.9616
Silver 26462.84 14665.6792
White 19.00 6.7926

(3 row(s) affected)

USE AdventureWorks2008R2;
GO
SELECT Color, ListPrice, StandardCost
FROM Production.Product
WHERE Color IS NOT NULL 
    AND ListPrice != 0.00 
    AND Name LIKE 'Mountain%'
ORDER BY Color
COMPUTE SUM(ListPrice), SUM(StandardCost) BY Color;
GO
Here is the result set.
Color ListPrice StandardCost
--------------- --------------------- ---------------------
Black 2294.99 1251.9813
Black 2294.99 1251.9813
Black 2294.99 1251.9813
Black 1079.99 598.4354
Black 1079.99 598.4354
Black 1079.99 598.4354
Black 1079.99 598.4354
Black 3374.99 1898.0944
Black 3374.99 1898.0944
Black 3374.99 1898.0944
Black 3374.99 1898.0944
Black 539.99 294.5797
Black 539.99 294.5797
Black 539.99 294.5797
Black 539.99 294.5797
Black 539.99 294.5797

sum sum
--------------------- ---------------------
27404.84 15214.9616

Color ListPrice StandardCost
--------------- --------------------- ---------------------
Silver 2319.99 1265.6195
Silver 2319.99 1265.6195
Silver 2319.99 1265.6195
Silver 3399.99 1912.1544
Silver 3399.99 1912.1544
Silver 3399.99 1912.1544
Silver 3399.99 1912.1544
Silver 769.49 419.7784
Silver 769.49 419.7784
Silver 769.49 419.7784
Silver 769.49 419.7784
Silver 564.99 308.2179
Silver 564.99 308.2179
Silver 564.99 308.2179
Silver 564.99 308.2179
Silver 564.99 308.2179

sum sum
--------------------- ---------------------
26462.84 14665.6792

Color ListPrice StandardCost
--------------- --------------------- ---------------------
White 9.50 3.3963
White 9.50 3.3963

sum sum
--------------------- ---------------------
19.00 6.7926

(37 row(s) affected)

B. Calculating group totals with more than one column

The following example calculates the sum of the ListPrice and StandardCost for each color listed in the Product table.
USE AdventureWorks2008R2;
GO
SELECT Color, SUM(ListPrice), SUM(StandardCost)
FROM Production.Product
GROUP BY Color
ORDER BY Color;
GO

Here is the result set.
Color
--------------- --------------------- ---------------------
NULL 4182.32 2238.4755
Black 67436.26 38636.5002
Blue 24015.66 14746.1464
Grey 125.00 51.5625
Multi 478.92 272.2542
Red 53274.10 32610.7661
Silver 36563.13 20060.0483
Silver/Black 448.13 198.97
White 36.98 13.5172
Yellow 34527.29 21507.6521

(10 row(s) affected)

VAR (Transact-SQL)


Returns the statistical variance of all values in the specified expression. May be followed by the OVER clause.
VAR ( [ ALL | DISTINCT ] expression )

ALL
Applies the function to all values. ALL is the default.
DISTINCT
Specifies that each unique value is considered.
expression
Is an expression of the exact numeric or approximate numeric data type category, except for the bit data type. Aggregate functions and subqueries are not permitted.
If VAR is used on all items in a SELECT statement, each value in the result set is included in the calculation. VAR can be used with numeric columns only. Null values are ignored.
The following example returns the variance for all bonus values in the SalesPerson table.
USE AdventureWorks2008R2;
GO
SELECT VAR(Bonus)
FROM Sales.SalesPerson;
GO

VARP (Transact-SQL)

Returns the statistical variance for the population for all values in the specified expression. May be followed by the OVER clause.
VARP ( [ ALL | DISTINCT ] expression )
ALL
Applies the function to all values. ALL is the default.
DISTINCT
Specifies that each unique value is considered.
expression
Is an expression of the exact numeric or approximate numeric data type category, except for the bit data type. Aggregate functions and subqueries are not permitted.
If VARP is used on all items in a SELECT statement, each value in the result set is included in the calculation. VARP can be used with numeric columns only. Null values are ignored.
The following example returns the variance for the population for all bonus values in the SalesPerson table.
USE AdventureWorks2008R2;
GO
SELECT VARP(Bonus)
FROM Sales.SalesPerson;
GO

Ranking Functions (Transact-SQL)

Ranking functions return a ranking value for each row in a partition. Depending on the function that is used, some rows might receive the same value as other rows. Ranking functions are nondeterministic. Transact-SQL provides the following ranking functions:
The following shows the four ranking functions used in the same query. For function specific examples, see each ranking function.
USE AdventureWorks2008R2;
GO
SELECT p.FirstName, p.LastName
    ,ROW_NUMBER() OVER (ORDER BY a.PostalCode) AS 'Row Number'
    ,RANK() OVER (ORDER BY a.PostalCode) AS 'Rank'
    ,DENSE_RANK() OVER (ORDER BY a.PostalCode) AS 'Dense Rank'
    ,NTILE(4) OVER (ORDER BY a.PostalCode) AS 'Quartile'
    ,s.SalesYTD, a.PostalCode
FROM Sales.SalesPerson s 
    INNER JOIN Person.Person p 
        ON s.BusinessEntityID = p.BusinessEntityID
    INNER JOIN Person.Address a 
        ON a.AddressID = p.BusinessEntityID
WHERE TerritoryID IS NOT NULL 
    AND SalesYTD <> 0;
Here is the result set.
FirstNameLastNameRow NumberRankDense RankQuartileSalesYTDPostalCode
MichaelBlythe11114557045.045998027
LindaMitchell21115200475.231398027
JillianCarson31113857163.633298027
GarrettVargas41111764938.985998027
TsviReiter51122811012.715198027
ShuIto66223018725.485898055
JoséSaraiva76223189356.246598055
DavidCampbell86233587378.425798055
TeteMensa-Annan96231931620.183598055
LynnTsoflias106231758385.92698055
RachelValdez116242241204.042498055
JaePak126245015682.375298055
RanjitVarkey Chudukatil136243827950.23898055

RANK (Transact-SQL)

SQL Server 2008 R2
 
Returns the rank of each row within the partition of a result set. The rank of a row is one plus the number of ranks that come before the row in question. Topic link iconTransact-SQL Syntax Conventions
RANK ()    OVER ( [ < partition_by_clause > ] < order_by_clause > )
< partition_by_clause >
Divides the result set produced by the FROM clause into partitions to which the RANK function is applied. For the PARTITION BY syntax, see OVER Clause (Transact-SQL).
< order_by_clause >
Determines the order in which the RANK values are applied to the rows in a partition. For more information, see ORDER BY Clause (Transact-SQL). An integer cannot represent a column when the < order_by_clause > is used in a ranking function.
bigint
If two or more rows tie for a rank, each tied rows receives the same rank. For example, if the two top salespeople have the same SalesYTD value, they are both ranked one. The salesperson with the next highest SalesYTD is ranked number three, because there are two rows that are ranked higher. Therefore, the RANK function does not always return consecutive integers.
The sort order that is used for the whole query determines the order in which the rows appear in a result set.
The following example ranks the products in inventory according to their quantities. The rowset is partitioned by LocationID and sorted by Quantity. Notice that the ORDER BY in the OVER clause orders the RANK and the ORDER BY of the SELECT statement orders the result set.
USE AdventureWorks2008R2;
GO
SELECT i.ProductID, p.Name, i.LocationID, i.Quantity
    ,RANK() OVER 
    (PARTITION BY i.LocationID ORDER BY i.Quantity DESC) AS 'RANK'
FROM Production.ProductInventory i 
    INNER JOIN Production.Product p 
        ON i.ProductID = p.ProductID
ORDER BY p.Name;
GO

DENSE_RANK (Transact-SQL)

 
Returns the rank of rows within the partition of a result set, without any gaps in the ranking. The rank of a row is one plus the number of distinct ranks that come before the row in question. Topic link iconTransact-SQL Syntax Conventions
DENSE_RANK ()    OVER ( [ <partition_by_clause> ] < order_by_clause > )
<partition_by_clause>
Divides the result set produced by the FROM clause into partitions to which the DENSE_RANK function is applied. For the PARTITION BY syntax, see OVER Clause (Transact-SQL).
<order_by_clause>
Determines the order in which the DENSE_RANK values are applied to the rows in a partition. An integer cannot represent a column in the <order_by_clause> that is used in a ranking function.
bigint
If two or more rows tie for a rank in the same partition, each tied rows receives the same rank. For example, if the two top salespeople have the same SalesYTD value, they are both ranked one. The salesperson with the next highest SalesYTD is ranked number two. This is one more than the number of distinct rows that come before this row. Therefore, the numbers returned by the DENSE_RANK function do not have gaps and always have consecutive ranks.
The sort order used for the whole query determines the order in which the rows appear in a result. This implies that a row ranked number one does not have to be the first row in the partition.
The following example returns the DENSE_RANK of the quantity of products at the various locations. Notice that the ORDER BY in the OVER clause orders the DENSE_RANK and the ORDER BY of the SELECT statement orders the result set.
USE AdventureWorks2008R2;
GO
SELECT i.ProductID, p.Name, i.LocationID, i.Quantity, 
    DENSE_RANK() OVER (PARTITION BY i.LocationID ORDER BY i.Quantity)     AS DENSE_RANK
FROM Production.ProductInventory i 
    INNER JOIN Production.Product p 
        ON i.ProductID = p.ProductID
ORDER BY Name;
GO

No comments:

Post a Comment