Thursday, July 17, 2008

Determine attribute utilization for CRM entities

Upgrading Dynamics CRM 3.0 systems to 4.0 has kept me very busy lately. I've been involved with four upgrades in the past few months and have more on the horizon I'm sure.

On one of the upgrade projects, the client's previous IT staff built-up the Account entity to include about 300 attributes. Now, I'm not one to quickly judge this as a bad idea without getting the facts and rationale for having so many custom attributes. Although I wouldn't have designed the entity the same way (I would've used a few picklists to replace the 100+ bit fields) the current design was built, was working, and it wasn't in the budget to change that part of the system.

But the question I had was this, "What percentage of each Account attribute has a value (is not null)?". Knowing the answer to this, and knowing what types of clients, partners, suppliers, etc. were stored in the Account entity, might lead to some decisions on whether all of the attributes are really necessary or whether a particular company type might be best spun-off to a new entity.

At the bottom of this posting I provide the T-SQL script I put together to answer the "percentage of attribute utilization" question.

After running the script, I found one classification of companies that accounted for 80% of the 150,000 Account records. But that company type only stored data in 10% of the 300 attributes! In other words, there were approximately 35,000,000 fields in SQL Server with NULL values for those Account records!

Conversely, the other types of companies stored in the Account entity utilized 90% of all account attributes. So, most account attributes were being populated, but not for a very large percentage of account records.

The problem with having unpopulated attributes is not so much the unnecessary database overhead but the complexity of the Account form when users only need to fill-in 25 or so fields for one of the records types but need to populate up to 300 fields for other account types. That leads to a lot of confusion about what data is required in the UI. Building reports, too, becomes much more difficult with so many attributes.

Here's the T-SQL script I created for this attribute utilization analysis. Again, this will help you determine what percentage of each attribute in the Account entity has a non-null value. Of course, you can tweak this code to analyze any other entity or even all entities and attributes!

set nocount on
declare @table varchar(255)
declare @col varchar(255)
declare @datatype varchar(255)
declare @sql varchar(1000)
declare @total_recs int
declare @total_non_null_recs int

declare cur cursor for
select table_name, column_name, data_type
from information_schema.columns
where table_name = 'FilteredAccount' -- tweak this
order by [table_name], [ordinal_position]

open cur
fetch next from cur into @table, @col, @datatype
while (@@fetch_status > -1)
begin
set @sql = 'declare @total_recs float; select @total_recs = count(*) from ' + @table + ';'
set @sql = @sql + 'declare @total_non_null_recs float;'
set @sql = @sql + 'select @total_non_null_recs = count(*) from ' + @table + ' where [' + @col + '] is not null;'
set @sql = @sql + 'declare @percent_data_availability int;if @total_recs > 0 begin select @percent_data_availability = round((@total_non_null_recs / @total_recs) * 100, 0) end else begin select @percent_data_availability = 0 end;'
set @sql = @sql + 'print ' + '''' + @table + ',' + @col + ',' + '''' + ' + convert(varchar(10),@percent_data_availability)'
execute (@sql)
fetch next from cur into @table, @col, @datatype
end
close cur
deallocate cur



The output of this script appears, in SQL Server Management Studio, on the Messages tab in the query results area of the query window.

The ouput includes the table name, attribute name, and an integer representing the percentage of records where the attribute has a non-null value. For example, the output below means that 78% of the address1_city attributes in all records in the FilteredAccount view has a value:

FilteredAccount,address1_city,78

In conclusion, you can use this script or something like it to quickly determine which entities and attributes in your CRM system has a value. If you see a low percentage for an attribute that you think should be much higher then that will likely lead you to take action, such as requiring values in the UI or perhaps removing the attribute altogether.

No comments:

Post a Comment