June 2007 - Posts

Stuff I've been intending to post a meaningful post about, but haven't:

If you have ever wondered what ildasm is all about, here's a great link.
http://msdn.microsoft.com/msdnmag/issues/01/05/bugslayer/

Simple SQLCLR stored proc deployment walkthrough:
http://blogs.msdn.com/vsdata/archive/2004/12/14/300216.aspx

Three sweet articles about Threads, System.Threading.Thread, and !threads (sos.dll)
http://blogs.msdn.com/yunjin/archive/2005/08/25/456355.aspx
http://blogs.msdn.com/yunjin/archive/2005/08/29/457150.aspx
http://blogs.msdn.com/yunjin/archive/2005/08/30/457756.aspx

A couple of sites that led me to err.exe
http://blogs.technet.com/brad_rutkowski/archive/2007/03/29/the-case-of-sidebar-exe-not-starting-oh-snap.aspx
http://blogs.technet.com/brad_rutkowski/archive/2006/09/18/to-err-is-admin.aspx

some unit testing stuff i wanted to read
http://www.codeproject.com/cs/database/UnitTestDbAppsWithNDbUnit.asp
http://msdn.microsoft.com/msdnmag/issues/05/03/TestRun/
http://msdn.microsoft.com/msdnmag/issues/06/01/UnitTesting/

and, finally, a great article on troubleshooting memory management issues with .net
http://msdn.microsoft.com/msdnmag/issues/06/11/CLRInsideOut/

 

I'm tagging this post with 'tips' if for nothing else than err.exe =) 

These are the articles (in no particular order) that I felt best showed a thorough use of the WinDbg.exe tool from start to finish. They were absolutely priceless to me. Enjoy!

+ 1 Troubleshooting ASP.NET using WinDbg and the SOS extension

 

=)

5 things I wish I had known or done prior to attempting to work with SOS.dll and windbg.exe: 

  1. sos.dll needs to be in the path for windbg in order to load it
  2. you can .load %full path to sos.dll% instead of .load sos mscorwks or other statements
  3. SOS for .NET 2.0 does *NOT* have all the commands the .NET 1.x version does (a source)
  4. where these two articles were: one, two.

I experienced an extreme amount of pain when I first worked with SOS.dll because of some of the problems above. I also found that there were LOTS of articles about using it, but some were more detailed than others about carrying through to the end of the process. i found many articles along the way that utilized a variety of different tools. i think the links above were the most useful in shedding immediate light on sos.dll and how it is used in the debugging process.

something cool i've never heard of. quoted from http://blogs.msdn.com/dougste/ about a path value of \\?\C:\WEBSites\WWWMyApp\scripts:

First of all, what is illegal about this path? Well, nothing, if you are a Unicode Win32 API. As you can read in Naming a File on MSDN, certain Unicode Win32 file handling APIs allow a path to be prefixed with \\?\ which allow paths to be up to 32,000 characters in length among other things. It also tells the operating system to not canonicalize the path by interpreting things such as .. to mean 'go to the parent directory'. Unfortunately not all parts of the System.IO namespace in .NET have yet caught up with this reality and still consider ? in a path to be illegal.

since i've had it in mind to write this for *so* long, i will at least write something, and maybe someday i'll come back and clean it up. =P

i was talking to a friend of mine quite some time back about what i call the 'triangle theory' of resource management at most companies. the idea is that most companies distribute resources in a triangular fashion. the top (narrow) part of the triangle is where the smallest amount of resources go. the top part is also where the most skilled people reside, and its small area represents how few of them there are. the bottom is much fatter and is where there are ultimately more resources. however, there are a lot more people there. the unskilled people. perhaps, i should say less skilled or just 'skilled people in training' =)

the reason this works well as a visual is that a triangle actually shows that your top tier only has room for so much. a very small amount. so as your workers improve and start working towards the top of the triangle, eventually, there isn't enough room and they must either be pushed out of the triangle altogether (leave/fired/whatever) or they must push someone else out.

my take is that you should utilize what i have termed the 'upside triangle method' to distribute resources. instead of investing money in that way (ultimately, less money in more skilled people) turn the triangle over. invest the majority of money in the highly skilled people. allow them to work collaboratively and have plenty of room at the top. minimize the people with very little skill and allow plenty of room for them to grow. this also gives the less skilled people more role models and allows them to grow even faster.


so instead of this

1 super senior - 3 senior - 6 mid level - 12 jr

100 - 240 - 360 - 600


invest in something like

6 super senior - 4 senior - 2 mid level - 1 jr

600 - 320 - 120 - 50


i'm using rough figures and ratios here, but let's say the numbers are thousands. so super senior are getting 100k, senior are getting 80k, mid level are getting 60k and junior are getting 50k. these numbers, in my opinion, are not high enough to reflect the current IT market, but let's say the ratio is close. if you add these up (1,300k versus 1,090k), you can see you save about 210k just by going to the upside triangle model. in addition, you go from 22 people to 13, which is not much more than half the staff. this is awesome! for some reason, most managers associate headcount with success rather than actual results. in reality, it's always better to get more done with less. if you don't realize that, you are not doing your job correctly and are practicing the "Get Me Promoted Methodology" (GMPM) rather than a results driven approach.
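
here's a quick sketch that adds up the two staffing models so the totals can be checked. the salary figures are just the rough ones from this post, and the names (TriangleMath, Payroll, Staff) are made up for the example. it should come out to 1300k for 22 people in the normal triangle and 1090k for 13 people in the upside triangle.

```csharp
using System;

public static class TriangleMath
{
    // salary per person in thousands, in this order:
    // super senior, senior, mid level, jr (rough figures from the post)
    static readonly int[] SalaryK = { 100, 80, 60, 50 };

    // total payroll in thousands for a given headcount per level
    public static int Payroll(int[] headcount)
    {
        int total = 0;
        for (int i = 0; i < headcount.Length; i++)
        {
            total += headcount[i] * SalaryK[i];
        }
        return total;
    }

    // total number of people across all levels
    public static int Staff(int[] headcount)
    {
        int total = 0;
        foreach (int n in headcount)
        {
            total += n;
        }
        return total;
    }

    public static void Main()
    {
        int[] triangle = { 1, 3, 6, 12 };
        int[] upside = { 6, 4, 2, 1 };

        Console.WriteLine("triangle: {0}k for {1} people", Payroll(triangle), Staff(triangle));
        Console.WriteLine("upside:   {0}k for {1} people", Payroll(upside), Staff(upside));
        Console.WriteLine("savings:  {0}k", Payroll(triangle) - Payroll(upside));
    }
}
```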

using fewer people generally leads to fewer misunderstandings and a cleaner solution, and the communication overhead in the group (especially with a size difference like this) drops quite a bit.

well, that's all for now. i want to really clean this up sometime and put some pictures in here etc from the whiteboard, but it's been hanging out in my 'go blog about this' queue for a couple/few months now so it's time to put it out there at least. then i can revisit it. =)

I found this while surfing and thought it was a noteworthy point on .net multi-threading.

quoted from http://msdn2.microsoft.com/en-us/library/ms998547.aspx (underlining is mine):

The CLR exposes managed threads, which are distinct from Microsoft Win32® threads. The logical thread is the managed representation of a thread, and the physical thread is the Win32 thread that actually executes code. You cannot guarantee that there will be a one-to-one correspondence between a managed thread and a Win32 thread.

If you create a managed thread object and then do not start it by calling its Start method, a new Win32 thread is not created. When a managed thread is terminated or it completes, the underlying Win32 thread is destroyed. The managed representation (the Thread object) is cleaned up only during garbage collection some indeterminate time later.

The .NET Framework class library provides the ProcessThread class as the representation of a Win32 thread and the System.Threading.Thread class as the representation of a managed thread.

Poorly-written multithreaded code can lead to numerous problems including deadlocks, race conditions, thread starvation, and thread affinity. All of these issues can negatively impact application performance, scalability, resilience, and correctness.
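
To make the managed versus Win32 distinction a bit more concrete, here is a minimal sketch (the class and method names are mine, not from the article). It creates a Thread object, looks at its state before Start is called, and then looks again after the thread has finished. It also prints the managed thread id next to the number of ProcessThread entries for the process, which is the OS-level view.

```csharp
using System;
using System.Threading;

public static class ThreadDemo
{
    // creates a trivial managed thread and reports its state
    // before Start is called and after it has completed
    public static ThreadState[] ObserveStates()
    {
        Thread t = new Thread(() => Thread.Sleep(10));
        ThreadState before = t.ThreadState;   // Start has not been called yet
        t.Start();
        t.Join();                             // wait for the thread to finish
        ThreadState after = t.ThreadState;
        return new[] { before, after };
    }

    public static void Main()
    {
        ThreadState[] states = ObserveStates();
        Console.WriteLine("before Start: {0}", states[0]);
        Console.WriteLine("after Join:   {0}", states[1]);

        // System.Threading.Thread is the managed view of a thread
        Console.WriteLine("managed thread id: {0}", Thread.CurrentThread.ManagedThreadId);

        // ProcessThread (via Process.Threads) is the OS-level view
        int osThreads = System.Diagnostics.Process.GetCurrentProcess().Threads.Count;
        Console.WriteLine("OS threads in this process: {0}", osThreads);
    }
}
```

The Thread object is still around after the thread has stopped, which lines up with the quote's point that the managed representation is only cleaned up later by the garbage collector.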

i feel like i have seen a ton of places that say for and foreach are the same performance-wise in .net. i never really worried about it much. i have historically done a lot of refactoring and loops are definitely one area you can see lots of improvement in. so i figured i would post some actual documentation about this. =)

 quoted from http://msdn2.microsoft.com/en-us/library/ms998574.aspx:

Using foreach can result in extra overhead because of the way enumeration is implemented in .NET Framework collections. .NET Framework 1.1 collections provide an enumerator for the foreach statement to use by overriding the IEnumerable.GetEnumerator. This approach is suboptimal because it introduces both managed heap and virtual function overhead associated with foreach on simple collection types. This can be a significant factor in performance-sensitive regions of your application. If you are developing a custom collection for your custom type, consider the following guidelines while implementing IEnumerable:

  • If you implement IEnumerable.GetEnumerator, also implement a nonvirtual GetEnumerator method. Your class's IEnumerable.GetEnumerator method should call this nonvirtual method, which should return a nested public enumerator struct.
  • Explicitly implement the IEnumerator.Current property on your enumerator struct.

For more information about implementing custom collections and about how to implement IEnumerable as efficiently as possible, see "Collection Guidelines" in Chapter 5, "Improving Managed Code Performance."

Consider using a for loop instead of foreach to increase performance for iterating through .NET Framework collections that can be indexed with an integer.

 

so, there you have it. use for to get the greatest performance. =)
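
here's a rough sketch of what the two guidelines in the quote look like in practice. IntBag and LoopDemo are made-up names for this example, and this is just one way to follow the guidance, not code from the article. the collection has a nonvirtual GetEnumerator that returns a nested public struct, the IEnumerable version calls it, and IEnumerator.Current is implemented explicitly. i'm not including any timings here, so measure in your own code before deciding anything.

```csharp
using System;
using System.Collections;

// a tiny custom collection of ints following the two quoted guidelines
public class IntBag : IEnumerable
{
    private readonly int[] items;

    public IntBag(int[] items) { this.items = items; }

    public int Count { get { return items.Length; } }
    public int this[int index] { get { return items[index]; } }

    // nonvirtual and strongly typed: foreach over an IntBag binds to this
    // method, so the enumerator is a struct and each int is not boxed
    public Enumerator GetEnumerator() { return new Enumerator(items); }

    // the interface version just calls the nonvirtual one
    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }

    public struct Enumerator : IEnumerator
    {
        private readonly int[] items;
        private int index;

        internal Enumerator(int[] items) { this.items = items; index = -1; }

        // typed Current, used by foreach over an IntBag
        public int Current { get { return items[index]; } }

        // explicit implementation: only reached through the IEnumerator interface
        object IEnumerator.Current { get { return items[index]; } }

        public bool MoveNext() { return ++index < items.Length; }
        public void Reset() { index = -1; }
    }
}

public static class LoopDemo
{
    // plain indexed loop
    public static int SumFor(IntBag bag)
    {
        int sum = 0;
        for (int i = 0; i < bag.Count; i++) sum += bag[i];
        return sum;
    }

    // foreach, which uses the struct enumerator above
    public static int SumForeach(IntBag bag)
    {
        int sum = 0;
        foreach (int n in bag) sum += n;
        return sum;
    }

    public static void Main()
    {
        IntBag bag = new IntBag(new[] { 1, 2, 3, 4 });
        Console.WriteLine(SumFor(bag));
        Console.WriteLine(SumForeach(bag));
    }
}
```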

This is in response to this article and is me giving dennis a hard time. =P

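// needs: using System.IO; using System.Text; using System.Xml;
byte[] XmlToByte(XmlDocument d)
{
    using (StringWriter sw = new StringWriter())
    {
        using (XmlTextWriter xw = new XmlTextWriter(sw))
        {
            d.WriteTo(xw);
            xw.Flush(); // make sure everything has reached the StringWriter

            // note: ASCII turns any non-ASCII character into '?'
            ASCIIEncoding encoding = new ASCIIEncoding();
            return encoding.GetBytes(sw.ToString());
        }
    }
}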

Ah. Garbage Collection... how I love and hate thee. =P

I think one sad thing about programming in .net is that it seems many developers don't know or care anything about garbage collection and memory management. You used to *have* to know about it in order to write bug free code. I suppose it is a double-edged sword: the intention was to relieve developers of the need to know or care about memory management, and developers can develop faster because of it. I personally have always tried to follow best practices for implementation and have tried to stay somewhat cautious of issues related to memory. I've been fortunate in never really having any memory management related issues in my previous applications. However, there, but for the grace of God, I too could go.

Recently, we have come across an issue that has required some intense scrutiny on our memory management process. Although it appears we are using mostly best practices for our access and use of certain objects (mainly in the System.Xml namespace) we are still running into memory problems. Unfortunately, our application includes some manual invocations of the GC.Collect() (EWWWWWWWW) to supposedly free up some of the memory consumed by some large objects.

Blah blah blah. The point is I have had to become a lot more familiar with garbage collection and memory management in general. I figured I would post what I have found to be the best sources for information on this topic, in the order I think it's best to read them. Some of these refer to each other and I am merely following the order suggested by them. Others, I am simply including in the order that seemed to make the most sense to me. I am extremely confident that after you read these articles, you will have a great handle on memory management and how the garbage collector works in .net.

I'm also including, as a bonus, this article which goes very deep into resource management using dispose and finalize. I went ahead and tagged this as architecture as well since I believe any good architect should want to know about the underlying framework.
 
Enjoy!
 
[edit] I'm going to tack a +1 onto this stack because I forgot I read this one somewhere between 1 and 6 and I found it helpful. If you read the above though, it's not really necessary.

so I saw this headline the other day in one of my feeds:

Speed Up Performance And Slash Your Table Size By 90% By Using Bitwise Logic

which, of course, piqued my interest. after reading this article, i decided to do a little research. one thing to realize is that using the int datatype you can only utilize 32 choices. i'm not an expert in using base 2 for math, however i went out and tried to find some articles that would refresh my memory. the following two were my favorites:

Bitwise Operations in C

PHP Bitwise Tutorial

The second had my favorite bit (haha!) of bitwise information. This swell comic. =P

counting: bit by bit 

 

as you can see by the comments in the other guy's blog, not everyone agrees with this 'clever' idea. i have to admit that it is definitely more work to ensure that your developers are familiar with bitwise calculations in general. i think it would be fun though if you have some type that could be multiple choice. one commenter raises the idea that in theory using 8 bit fields instead of 1 int field with bitwise logic should take up the same storage space. definitely an interesting way to store data. and that's before you consider not using sql at all. maybe you want to use xml or some other method of storage. you could probably even serialize some custom object. that roams a bit far away from a pure sql conversation, but.... well whatever.
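
for anyone whose base 2 is as rusty as mine, here's a small sketch of the general idea on the .net side, using a [Flags] enum. the Toppings choices and the helper names are made up for illustration and are not from the article. each choice gets its own bit, so several choices fit in one int, which is the value you would end up storing.

```csharp
using System;

// hypothetical set of choices; each one gets its own bit
[Flags]
public enum Toppings
{
    None = 0,
    Cheese = 1 << 0,   // 1
    Bacon = 1 << 1,    // 2
    Onion = 1 << 2,    // 4
    Pickle = 1 << 3    // 8
}

public static class BitwiseDemo
{
    // OR turns a bit on
    public static Toppings Add(Toppings current, Toppings choice) { return current | choice; }

    // AND with the complement turns a bit off
    public static Toppings Remove(Toppings current, Toppings choice) { return current & ~choice; }

    // AND then compare checks whether a bit is on
    public static bool Has(Toppings current, Toppings choice) { return (current & choice) == choice; }

    public static void Main()
    {
        Toppings order = Toppings.None;
        order = Add(order, Toppings.Cheese);
        order = Add(order, Toppings.Onion);

        Console.WriteLine((int)order);                  // the single int you would store
        Console.WriteLine(Has(order, Toppings.Onion));
        Console.WriteLine(Has(order, Toppings.Bacon));

        order = Remove(order, Toppings.Cheese);
        Console.WriteLine(order);
    }
}
```

with cheese (1) and onion (4) set, the stored int is 5. the same OR / AND tricks apply to an int column in sql.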

 

i thought it was definitely interesting enough to share. =) 

 

 

 

create table t(id int identity primary key, v bit)
go

insert t select 1
go 10

select count(*) [before] from t

begin tran
    truncate table t
    select count(*) [after truncate] from t
rollback tran

select count(*) [after rollback] from t
go

drop table t
go

 

I have heard in multiple meetings that a truncate cannot be rolled back. It's not true. Please investigate your individual situations, but the idea that you cannot roll back a truncate at all is incorrect.

 Johnson, Sean [9:58 AM]:
I love these
Ashbrook, Roy [9:59 AM]:
i have seen entire procs that have like a whole extra complicated proc in them, just because they want one temp table that they think they need hehe
Ashbrook, Roy [9:59 AM]:
so they copy like the top 80% of the proc, just to get the 10% in the middle =P
Johnson, Sean [9:59 AM]:
ok you are looking at this along with me apparently... LOL!
Johnson, Sean [9:59 AM]:
same thing
Ashbrook, Roy [9:59 AM]:
yep. makes me sad haha
Johnson, Sean [9:59 AM]:
it makes you cringe to look at the rest of the procs now doesnt it
Ashbrook, Roy [10:00 AM]:
i normally shake my head with disappointment before i open them
Ashbrook, Roy [10:00 AM]:
it saves time
Ashbrook, Roy [10:00 AM]:
=P
Johnson, Sean [10:00 AM]:
lmao
Johnson, Sean [10:00 AM]:
we should blog that as a best practice
Ashbrook, Roy [10:00 AM]:
haha
Johnson, Sean [10:00 AM]:
"shake head first before opening"
Ashbrook, Roy [10:01 AM]:
we could call them shfbo procs
Ashbrook, Roy [10:01 AM]:
pronounced shifbo!
Johnson, Sean [10:02 AM]:
lol.. new interview term has been invented
Johnson, Sean [10:02 AM]:
Hi are you familiar with shifbo procs?

-- create a mini inventory structure
-- let's say it has toys in it
declare @x xml
set @x =
'<items>
    <item id="1" type="toy">
        <name>car</name>
        <description>toy car</description>
        <price>10</price>
    </item>
    <item id="2" type="toy">
        <name>bike</name>
        <description>toy bike</description>
        <price>100</price>
    </item>
    <item id="3" type="sport">
        <name>bike</name>
        <description>real bike</description>
        <price>100</price>
    </item>
</items>'

--look at the toys
select
    x.item.value('@id[1]', 'int') [id]
    , x.item.value('@type[1]', 'varchar(20)') [type]
    , x.item.value('name[1]', 'varchar(20)') [name]
    , x.item.value('description[1]', 'varchar(20)') [description]
    , x.item.value('price[1]', 'money') [price]
from
    @x.nodes('//items/item') as x(item)

--wait, they should all be toys.
--what's that real bike doing in here?
--let's delete the non toys
set @x.modify('delete (/items/item[@type!="toy"])')

--hmm, that price is still way wrong on our toy bike, let's fix it
set @x.modify(
    'replace value of
        (/items/item[description/text() = "toy bike"]/price/text())[1]
    with
        "10"'
)

--yay!
select
    x.item.value('name[1]', 'varchar(20)') [name]
    , x.item.value('price[1]', 'money') [price]
from
    @x.nodes('//items/item') as x(item)

datalength

from BOL - "Returns the number of bytes used to represent any expression."

the individual that asked me about this recently started using some much larger xml objects in their xml datatype column. it threw off sql because the stats weren't being updated and caused some performance issues which an update statistics and a clustered index rebuild corrected.

this also works with the xml datatype variables. so you can do a

declare @x xml
set @x = 'some xml here'
select datalength(@x)