This script gives a list of all the columns in the database and the number of distinct values in them. This can be very useful if you take over a database and you want to find out if the columns are actually used. It might also come in handy if you need some information about the cardinality of your columns to help you design indexes. It also gives information about the largest size of data in a column and the actual size of the column, so you can find out if those text columns are really necessary for example. Although the script includes a few performance optimisations, it is not advisable to run it against a production database, as it will perform a table scan on all the tables multiple times. Jacco DECLARE @table sysname DECLARE @column sysname DECLARE @datatype sysname DECLARE @designed_length int DECLARE @all_count int DECLARE @sql nvarchar(4000) SET NOCOUNT ON EXEC sp_updatestats SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED -- Will speed things up a bit CREATE TABLE #table_info (table_name sysname NOT NULL ,column_name sysname NOT NULL ,data_type sysname NOT NULL ,designed_length int NULL ,max_length int NULL ,distinct_count int NULL ,all_count int NOT NULL ,cardinality AS CASE WHEN distinct_count IS NULL THEN CAST(data_type AS varchar(7)) WHEN all_count = 0 THEN CAST('No rows' AS varchar(7)) ELSE CAST( CAST(CAST(distinct_count AS decimal)/CAST(all_count AS decimal) AS decimal(18,4)) AS varchar(7)) END ) DECLARE c CURSOR FAST_FORWARD FOR SELECT isc.table_name, isc.column_name, isc.data_type, COALESCE(isc.character_maximum_length, isc.numeric_precision), si.rowcnt FROM information_schema.columns isc INNER JOIN information_schema.tables ist ON isc.table_name = ist.table_name INNER JOIN sysindexes si ON isc.table_name = OBJECT_NAME(si.id) WHERE ist.table_type = 'base table' AND ist.table_name not like 'dt%' AND si.indid IN (0,1) ORDER BY isc.table_name, isc.column_name OPEN c FETCH NEXT FROM c INTO @table, @column, @datatype, @designed_length, @all_count WHILE @@FETCH_STATUS = 0 BEGIN IF @datatype IN ('text', 'ntext', 'image') BEGIN SET @sql = 'SELECT ''' + @table + ''', ''' + @column + ''', ''' + @datatype + '''' SET @sql = @sql + ', ' + CAST(@designed_length AS varchar(10)) + ', MAX(DATALENGTH([' + @column + ']))' SET @sql = @sql + ', NULL' + ', ' + CAST(@all_count AS varchar(10)) + ' FROM [' + @table + ']' END ELSE BEGIN SET @sql = 'SELECT ''' + @table + ''', ''' + @column + ''', ''' + @datatype + '''' SET @sql = @sql + ', ' + CAST(@designed_length AS varchar(10)) + ', MAX(LEN(CAST([' + @column + '] AS VARCHAR(8000))))' SET @sql = @sql + ', COUNT(DISTINCT [' + @column + '])' SET @sql = @sql + ', ' + CAST(@all_count AS varchar(10)) + ' FROM [' + @table + ']' END PRINT @sql INSERT INTO #table_info (table_name, column_name, data_type, designed_length, max_length, distinct_count, all_count) EXEC(@sql) FETCH NEXT FROM c INTO @table, @column, @datatype, @designed_length, @all_count END CLOSE c DEALLOCATE c SELECT table_name, column_name, data_type, designed_length, max_length, distinct_count, all_count, cardinality FROM #table_info SET TRANSACTION ISOLATION LEVEL READ COMMITTED DROP TABLE #table_info
Agree! <br />If I may, a very small improvement to reduce number or rows in cursor<br /><pre><br />SELECT isc.table_name, isc.column_name, isc.data_type<br />FROM information_schema.columns isc<br /> INNER JOIN information_schema.tables ist<br /> ON isc.table_name = ist.table_name<br />WHERE OBJECTPROPERTY(OBJECT_ID(ist.table_name), N'IsUserTable') = 1 <br />AND OBJECTPROPERTY(OBJECT_ID(ist.table_name), N'IsMSShipped') = 0<br /><b>AND DATA_TYPE NOT IN ('text', 'ntext', 'image')</b><br />ORDER BY isc.table_name, isc.column_name<br /></pre><br /><br />You could also get the number or rows in a table (joining sysindexes), but that depends on update statistics. <img src='/community/emoticons/emotion-1.gif' alt='' /><br /><br />Bambola.<br /><br /><br />
Thanks for the compliments <img src='/community/emoticons/emotion-1.gif' alt='' /><br /><br />I have included Bambola's tip in the script (and made sure that it updates the statistics), included the BLOB datatypes, and added information about the length of the data that is actually in the columns, compared to the length that is defined for them.<br />