{"id":7342,"date":"2017-02-08T14:43:53","date_gmt":"2017-02-08T16:43:53","guid":{"rendered":"http:\/\/www.fabriciolima.net\/?p=7342"},"modified":"2017-02-08T15:12:29","modified_gmt":"2017-02-08T17:12:29","slug":"improve-the-query-performance-with-like-string-changing-only-the-collation","status":"publish","type":"post","link":"https:\/\/fabriciolima.net\/blog\/2017\/02\/08\/improve-the-query-performance-with-like-string-changing-only-the-collation\/","title":{"rendered":"Improve the performance of a query that uses &#8221; like &#8216;%String%&#8217; &#8221; changing only the collation"},"content":{"rendered":"<p>Hi Folks,<\/p>\n<p>In this blog post, I will give you a very useful tip to improve the performance of a query that uses <em>&#8221; like &#8216;%String%&#8217; &#8220;<\/em>.<\/p>\n<p>How often do you have to use a query like this?<\/p>\n<pre class=\"lang:tsql decode:true \">Select Columns,...\r\nfrom Table\r\nwhere Name like '%String%'<\/pre>\n<p>Reading the great book <a href=\"https:\/\/www.amazon.com\/Microsoft-Server-Internals-Developer-Reference\/dp\/0735658560\" target=\"_blank\"><em>SQL Server 2012 Internals<\/em><\/a>, on page 230, we have:<\/p>\n<p><em>&#8220;Another case where the collation can make a huge difference is when SQL Server has to look at almost all characters in the strings. For instance, look at the following:<\/em><\/p>\n<p><em>SELECT COUNT(*) FROM tbl WHERE longcol like &#8216;%abc%&#8217;<\/em><\/p>\n<p><em>This might execute 10 time faster or more with a binary collation than a nonbinary Windows colation. Also, with varchar data, this executes up to seven or eight times faster with a SQL collation than with a Windows collation. If you have a varchar column, you can speed this up by forcing the collation as follows:<\/em><\/p>\n<p><em>SELECT COUNT(*) from tbl<\/em><br \/>\n<em> WHERE longcol COLLATE SQL_Latin1_Genneral_CP_CI_AS LIKE &#8216;%abc%&#8217; &#8220;<\/em><\/p>\n<p>That&#8217;s a very important information. Let&#8217;s do some tests and see what&#8217;s happen.<\/p>\n<p>I will create a table with a varchar column and populate it with a lot of data.<\/p>\n<pre class=\"lang:tsql decode:true \">CREATE table dbo.Test_Collation_SQL (\r\nId_Table int identity(1,1) PRIMARY KEY,\r\nDt_Log datetime,\r\nDs_Name varchar(50)\r\n)\r\n\r\n-- Generate a lot of data. Can take a while...\r\nINSERT INTO dbo.Test_Collation_SQL\r\nselect getdate(), REPLICATE('A',50)\r\nGO 10\r\n\r\nINSERT INTO dbo.Test_Collation_SQL(Dt_Log,Ds_Name)\r\nSELECT Dt_Log,Ds_Name\r\nFROM Test_Collation_SQL\r\nGO 22<\/pre>\n<p>Now, I will insert two rows with my name to search later:<\/p>\n<pre class=\"lang:tsql decode:true \">insert into dbo.Teste_Collation_SQL\r\nselect getdate(), 'Fabricio Lima 1'\r\n\r\ninsert into dbo.Teste_Collation_SQL\r\nselect getdate(), '- Fabricio Lima 2'<\/pre>\n<p>Then I will create a index to try do help my query, just like everyone do:<\/p>\n<pre class=\"lang:tsql decode:true \">CREATE NONCLUSTERED INDEX SK01_Test_Collation_SQL ON Test_Collation_SQL(Ds_Name)  WITH(FILLFACTOR=90)<\/pre>\n<p>The size of the table is:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\" wp-image-7368 aligncenter\" src=\"https:\/\/www.fabriciolima.net\/blog\/wp-content\/uploads\/2017\/02\/PostCollationTable.png\" alt=\"\" width=\"465\" height=\"134\" srcset=\"https:\/\/fabriciolima.net\/blog\/wp-content\/uploads\/2017\/02\/PostCollationTable.png 514w, https:\/\/fabriciolima.net\/blog\/wp-content\/uploads\/2017\/02\/PostCollationTable-300x86.png 300w, https:\/\/fabriciolima.net\/blog\/wp-content\/uploads\/2017\/02\/PostCollationTable-410x118.png 410w, https:\/\/fabriciolima.net\/blog\/wp-content\/uploads\/2017\/02\/PostCollationTable-100x29.png 100w, https:\/\/fabriciolima.net\/blog\/wp-content\/uploads\/2017\/02\/PostCollationTable-275x79.png 275w, https:\/\/fabriciolima.net\/blog\/wp-content\/uploads\/2017\/02\/PostCollationTable-20x6.png 20w\" sizes=\"auto, (max-width: 465px) 100vw, 465px\" \/><\/p>\n<p>Now I will enable this option to compare the results<\/p>\n<pre class=\"lang:tsql decode:true \">SET STATISTICS IO ON\r\nSET STATISTICS TIME ON<\/pre>\n<p>I also enable the Actual Execution Plan to see what&#8217;s SQL doing: CTRL+M<\/p>\n<p>My instance has this Windows Collation: Latin1_General_CI_AI<\/p>\n<p>My Database and Column also has the same Collation: Latin1_General_CI_AI<\/p>\n<h6><strong>Test 1: Using my default Windows Collation: Latin1_General_CI_AI <\/strong><\/h6>\n<pre class=\"lang:tsql decode:true \">SELECT COUNT(*)\r\nFROM Test_Collation_SQL\r\nWHERE Ds_Name LIKE '%Fabricio%'<\/pre>\n<p>The results \u00e9 <strong>two rows<\/strong> and this query took<strong> 45 secods<\/strong> to finish. It&#8217;s not cache problems because I&#8217;ve executed it three times.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\" wp-image-7371 aligncenter\" src=\"https:\/\/www.fabriciolima.net\/blog\/wp-content\/uploads\/2017\/02\/PostCollationTimeBefore.png\" alt=\"\" width=\"713\" height=\"293\" srcset=\"https:\/\/fabriciolima.net\/blog\/wp-content\/uploads\/2017\/02\/PostCollationTimeBefore.png 832w, https:\/\/fabriciolima.net\/blog\/wp-content\/uploads\/2017\/02\/PostCollationTimeBefore-300x123.png 300w, https:\/\/fabriciolima.net\/blog\/wp-content\/uploads\/2017\/02\/PostCollationTimeBefore-768x316.png 768w, https:\/\/fabriciolima.net\/blog\/wp-content\/uploads\/2017\/02\/PostCollationTimeBefore-700x288.png 700w, https:\/\/fabriciolima.net\/blog\/wp-content\/uploads\/2017\/02\/PostCollationTimeBefore-410x169.png 410w, https:\/\/fabriciolima.net\/blog\/wp-content\/uploads\/2017\/02\/PostCollationTimeBefore-100x41.png 100w, https:\/\/fabriciolima.net\/blog\/wp-content\/uploads\/2017\/02\/PostCollationTimeBefore-275x113.png 275w, https:\/\/fabriciolima.net\/blog\/wp-content\/uploads\/2017\/02\/PostCollationTimeBefore-20x8.png 20w\" sizes=\"auto, (max-width: 713px) 100vw, 713px\" \/><\/p>\n<p><strong>Reads<\/strong><br \/>\nTable &#8216;Test_Collation_SQL&#8217;. Scan count 9, <strong>logical reads<\/strong> <strong>373572<\/strong><\/p>\n<p><strong>CPU Time<\/strong><br \/>\nSQL Server Execution Times:<br \/>\n<strong>CPU time =<\/strong> <strong>348922 ms<\/strong>,\u00a0 elapsed time = 45679 ms.<\/p>\n<h6><\/h6>\n<h6><strong> Teste 2: Forcing a SQL Server collation and searching the same thing<\/strong><\/h6>\n<pre class=\"lang:tsql decode:true \">SELECT COUNT(*)\r\nFROM TreinamentoDBA..Teste_Collation_SQL\r\nWHERE Descri\u00e7\u00e3o COLLATE SQL_Latin1_General_CP1_CI_AI LIKE '%Fabricio%'<\/pre>\n<p>The results is <strong>two rows<\/strong> and this query took only <strong>5 secods<\/strong>!!! <strong>Amazing!!! Unbelievable!!! It&#8217;s like magic!!<\/strong><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-7370 aligncenter\" src=\"https:\/\/www.fabriciolima.net\/blog\/wp-content\/uploads\/2017\/02\/PostCollationTimeAfter.png\" alt=\"\" width=\"778\" height=\"257\" srcset=\"https:\/\/fabriciolima.net\/blog\/wp-content\/uploads\/2017\/02\/PostCollationTimeAfter.png 1014w, https:\/\/fabriciolima.net\/blog\/wp-content\/uploads\/2017\/02\/PostCollationTimeAfter-300x99.png 300w, https:\/\/fabriciolima.net\/blog\/wp-content\/uploads\/2017\/02\/PostCollationTimeAfter-768x254.png 768w, https:\/\/fabriciolima.net\/blog\/wp-content\/uploads\/2017\/02\/PostCollationTimeAfter-700x231.png 700w, https:\/\/fabriciolima.net\/blog\/wp-content\/uploads\/2017\/02\/PostCollationTimeAfter-410x135.png 410w, https:\/\/fabriciolima.net\/blog\/wp-content\/uploads\/2017\/02\/PostCollationTimeAfter-100x33.png 100w, https:\/\/fabriciolima.net\/blog\/wp-content\/uploads\/2017\/02\/PostCollationTimeAfter-275x91.png 275w, https:\/\/fabriciolima.net\/blog\/wp-content\/uploads\/2017\/02\/PostCollationTimeAfter-20x7.png 20w\" sizes=\"auto, (max-width: 778px) 100vw, 778px\" \/><\/p>\n<p><strong>Reads<\/strong><br \/>\nTable &#8216;Test_Collation_SQL&#8217;. Scan count 9, <strong>logical reads 374622<\/strong><\/p>\n<p><strong>CPU Time<\/strong><br \/>\nSQL Server Execution Times:<br \/>\n<strong>CPU time =<\/strong> <strong>37265 ms<\/strong>,\u00a0 elapsed time = 5300 ms.<\/p>\n<p>The logical reads is almost the same, but we have a huge difference in the CPU Time.<\/p>\n<p>The two queries have the same execution Plan doing a Scan on the index SK01_Test_Collation_SQL and working on parallel.<\/p>\n<h6><strong>Results<\/strong><\/h6>\n<p>Windows Collation: 45 seconds and 348922 ms of CPU time<\/p>\n<p>SQL Collation: 5 seconds and\u00a0 37265 ms of CPU time<\/p>\n<p><span style=\"color: #ff0000;\"><strong>Unbelievable!!!<\/strong><\/span><\/p>\n<p>Why does it happen?<\/p>\n<p>This article help us to understand:<\/p>\n<p><a href=\"https:\/\/support.microsoft.com\/en-us\/help\/322112\/comparing-sql-collations-to-windows-collations\" target=\"_blank\">https:\/\/support.microsoft.com\/en-us\/help\/322112\/comparing-sql-collations-to-windows-collations<\/a><\/p>\n<p>&#8220;<em>If you are storing and handling your data by using non-Unicode data types (<span class=\"text-base\">char<\/span>, <span class=\"text-base\">varchar<\/span>, <span class=\"text-base\">text<\/span>), and you are using a <strong>SQL collation<\/strong>, string comparisons will be performed with a <strong>non-Unicode SQL sort order<\/strong>.<\/em><\/p>\n<p><em>If you are storing and handling your data by using non-Unicode data types (<span class=\"text-base\">char<\/span>, <span class=\"text-base\">varchar<\/span>, <span class=\"text-base\">text<\/span>), and you are using a <strong>Windows collation<\/strong>, string comparisons will be performed with the <strong>Unicode sorting rules<\/strong>. <strong>This may cause<\/strong> <strong>certain operations that are unusually dependent on string sorting performance to <span style=\"color: #ff0000;\">take longer and to use more CPU<\/span> than a similar operation that is performed with a SQL collation.<\/strong><\/em>&#8221;<\/p>\n<p><strong>I did another test with more data and the results are:<\/strong><\/p>\n<p><strong>Windows Collation:<\/strong> 1 minute and 48 seconds<\/p>\n<p><strong>SQL Collation:<\/strong> 12 seconds<\/p>\n<p><span style=\"color: #ff0000;\">It&#8217;s like a magic! A huge difference!!<br \/>\n<\/span><\/p>\n<p>Enjoy and share with your friends&#8230;<\/p>\n<p>Thanks to my friend Edvaldo (<a href=\"http:\/\/edvaldocastro.com\/\" target=\"_blank\">Blog<\/a>) for the revision of the text.<\/p>\n<p>Fabr\u00edcio Fran\u00e7a Lima<\/p>\n<p>MCITP \u2013 Database Administrator<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Hi Folks, In this blog post, I will give you a very useful tip to improve the performance of a query that uses &#8221; like &#8216;%String%&#8217; &#8220;. How often do you have to use a query like this? Select Columns,&#8230; from Table where Name like &#8216;%String%&#8217; Reading the great book SQL Server 2012 Internals, on [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_exactmetrics_skip_tracking":false,"_exactmetrics_sitenote_active":false,"_exactmetrics_sitenote_note":"","_exactmetrics_sitenote_category":0,"footnotes":""},"categories":[33],"tags":[1459,475,1456,1453,1434,1458,1457,1450,1452,1451,1455,33,1454],"class_list":["post-7342","post","type-post","status-publish","format-standard","hentry","category-sql-server","tag-collate-sql_latin1_general_cp1_ci_ai","tag-collation","tag-collation-bin","tag-collation-performance","tag-collation-sql-server","tag-latin1_general","tag-latin1_general_ci_ai","tag-like","tag-performance-problem","tag-performance-tuning","tag-sql-collation","tag-sql-server","tag-windows-collation"],"post_mailing_queue_ids":[],"_links":{"self":[{"href":"https:\/\/fabriciolima.net\/blog\/wp-json\/wp\/v2\/posts\/7342","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/fabriciolima.net\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/fabriciolima.net\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/fabriciolima.net\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/fabriciolima.net\/blog\/wp-json\/wp\/v2\/comments?post=7342"}],"version-history":[{"count":0,"href":"https:\/\/fabriciolima.net\/blog\/wp-json\/wp\/v2\/posts\/7342\/revisions"}],"wp:attachment":[{"href":"https:\/\/fabriciolima.net\/blog\/wp-json\/wp\/v2\/media?parent=7342"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/fabriciolima.net\/blog\/wp-json\/wp\/v2\/categories?post=7342"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/fabriciolima.net\/blog\/wp-json\/wp\/v2\/tags?post=7342"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}