timstall: July 2008

Thursday, July 31, 2008

Why would my program suddenly stop working?

[This was originally posted at http://timstall.dotnetdevelopersjournal.com/why_would_my_program_suddenly_stop_working.htm]

Deterministic bugs are easy. When you write "ConvertCelsiusToFahrenheit", the debugging is simple. When things break, they're very repeatable, and it's easy to step through the debugger and see why. However, production code doesn't work this way. Sometimes your enterprise application will just temporarily stop working, only to resume working correctly again a little later. Why? Here's a few ideas:

Caching - something was cached, and the cache expired.
Session - the session expired
External dependencies - a dependent web service or database could be down
Rare boundary condition - perhaps your code doesn't account for certain rare input (like nulls, or not escaping special characters)
Concurrency - perhaps the code works great in a single thread (which is how must code is tested), but doesn't handle being run concurrently, for example one thread deadlocks, or another process locks a resource.
Too much load - perhaps too much load temporarily crashed something - like throwing an out of memory exception.
Randomness - maybe your code uses random numbers, and most of those work, but some of them don't - i.e. the code crashes when the random number is divisible by 111, or something really weird like that.
Incremental buildup with rounding error - perhaps every time the code is run, it produces an incremental buildup somewhere, like inserting a row in a database table. And as long as there are less than X rows, it "rounds down" and works. However, once the table has X+1 rows, it "rounds up" and something fails. This is abnormal, but certainly possible.

There is almost always some sufficient cause that causes the code to act abnormally. It helps for your app to have a good logger, such that you have clues to track down what that cause was. It also helps to have a QA environment that matches production, so that you can try to reproduce the steps yourself. Knowing that there will inevitably be production errors, it should encourage us to write good code upfront such that we take care of all the easy errors and these preventable bugs don't distract us from fixing the non-trivial ones.

Tuesday, July 29, 2008

Getting file and line numbers without deploying the PDB files

[This was originally posted at http://timstall.dotnetdevelopersjournal.com/getting_file_and_line_numbers_without_deploying_the_pdb_file.htm]

Outline:

Problem
Inadequate Solutions
A Better Way
Step 1: Create the pdb2xml Database
Step 2: Query the pdb2xml Database Using the IL Offset
Step 3: Have Your Application Log the IL Offset
Download the source code and demo
Conclusion

----

Problem

Enterprise applications will inevitably throw exceptions. The ideal thing to do with these exceptions is to log them, send the results back to the appropriate developer, and then have that developer fix the code. Part of the problem is that the production (i.e. release-mode) logs are often missing helpful information that developers take for granted in debug mode, like the source file and line number.

Inadequate Solutions

One approach is to just settle and not get the extra info. For easy bugs it is sufficient to just have the callstack (from calling the exception’s ToString method) and some extra arbitrary parameters from a custom logger. But what about the bugs that aren't easy?

Another approach is to just dump the PDB files into your production environment. If you put the PDB (program database) files right next to the corresponding DLLs, then .Net will automatically generate the file and line numbers for you in every exception callstack. Recall that the PDB files contain information to reverse-engineer your code, such that a debugger could step through it. So you almost never want to give those files out to the public. That means this approach only works for applications you host, such as an ASP.Net app. But even still, it could be a security risk, or your IT department or deployment team may have a policy against it.

A Better Way

Looking at these inadequate solutions, it makes us appreciate the ideal solution, which would full the two criteria:

Provide you the PDB info like file name and line number
Not require you to ship or deploy the PDB files.

.Net allows you to do this. The “trick” is to get the “IL Offset” and use that to lookup in the PDB files for the exact info you need. Recall that all .Net code gets compiled to Intermediate Language (IL); therefore the IL Offset is just the line number in the IL code. The PDB maps the IL code to your source code. So, this approach has three main steps:

Create the pdb2xml database, this maps your IL to your source code.
Query the pdb2xml database using the IL Offset.
Have your application log the IL Offset.

This approach fulfills our criteria, so let’s explore it in more detail.

Step 1: Create the pdb2xml Database

The PDB files are not plain text, so they’re hard to work with for most people. However, the MSDN debugging team wrote a free tool that lets you convert the PDB files to XML (special thanks to Mike Stall for helping me better understand Pdb2Xml), such that you can easily look up info in them. You can download the “pdb2xml” converter tool from MSDN: http://www.microsoft.com/downloads/details.aspx?familyid=38449a42-6b7a-4e28-80ce-c55645ab1310&displaylang=en

When running the pdb2xml tool, it creates an xml file like so:

<symbols file="TestLoggerApp.Core.dll">
<files>
    <file id="1" name="C:\Temp\MyApp.Core\Class1.cs" ... />
    <file id="2" name="C:\Temp\MyApp.Core\Class2.cs" ... />
files>
<methods>
    <method name="MyApp.Core.Class3.Start3" token="0x6000002">
      <sequencepoints total="1">
        <entry il_offset="0x4" start_row="17" start_column="7"
          end_row="17" end_column="54" file_ref="1" />
      sequencepoints>
      <locals />
    method>
    …

This lists all the files, classes, and method involved. Each method can be looked up via the unique token. The node provides what we ultimately want – the coveted file and line number. We get the file by using entry.file_ref to lookup in the files section, and we get the line number from the entry.start_row attribute.

In order to get the exact node, we will need to know the specific:

Xml file to lookup at, where each xml file maps to a .Net assembly.
Method, which we can obtain from the token. The method name is just extra info for convenience.
IL Offset, which is stored as a hex value.

Because real application usually have many assemblies, ideally we could just point to a bin directory full of pdb files, and have a tool (like an automated build) dynamically generate all the corresponding xml files. We can write a wrapper for pdb2xml to do this.

The biggest issue when writing such a wrapper tool is that pdb2xml, which uses Reflection to dynamically load the assemblies, will get choked up when loading one assembly which contains a class that inherits a class in a different assembly. The easiest way to solve this is to just copy all the targeted assemblies (that you want to generate your xml files for) to the bin of pdb2xml. You could use the ReflectionOnlyAssemblyResolve event to handle resolution errors, but that will provide other problems because you need a physical file, but the event properties only give you the assembly name. While most of the time they’re the same, it will be one more problem to solve when they’re not.

Pdb2xml should handle a variety of cases – assemblies with strong names, third-party references, compiled in release mode, or files that are even several MB big.

ASP.Net applications are a little trickier. Starting with .Net 2.0, ASP allows another compilation model, where every page can get compiled to its own DLL. The easiest way to collect all these DLLs is to run the aspnet_compiler.exe tool, which outputs all the assemblies (and PDBs) to the web’s bin directory. You can read about the aspnet_compiler here: http://msdn.microsoft.com/en-us/library/ms229863(VS.80).aspx, or its MSBuild task equivalent: http://msdn.microsoft.com/en-us/library/ms164291.aspx.

Note that when using the aspnet_compiler, you need to include the debug ‘-d’ switch in order to generate the PDB files. A sample call (ignoring the line breaks) could look like:

aspnet_compiler.exe
-v /my_virtualDir
-p C:\Projects\MyWeb\
-f
-d
-fixednames C:\Projects\precompiledweb\MyWeb\

For convenience, I’ve attached a sample tool - PdbHelper.Cmd.Test (from the download) which will mass-generate these xml files for you. This solves the first step – converting the pdb files to an xml format that we can then query. You can now put those xml files anywhere you want, such as on a shared developer machine.

Step 2: Query the pdb2xml Database Using the IL Offset

Given an xml data island, we can easily query that data. The only thing we need is a key. In this case, we can have the application’s logger generate an xml snippet which some tool or process can then scan for and use it to lookup in the pdb2xml database. Let’s say that our logger gave us the following xml snippet (we’ll discuss how in the next step):

<ILExceptionData>
<Point module='TestLoggerApp.Core.dll' classFull='TestLoggerApp.Core.Class2'
methodName='Start2' methodSignature='Int32 Start2(System.String, Boolean)'
methodToken='0x6000005' ILOffset='11' />
...
ILExceptionData>

For each line in the stack trace, our this XML snippet contains a node. The node has the attributes needed to lookup in the pdb2xml database. These are the three values that we need:

module – the .Net module, which directly maps to an xml file.
methodToken – the token, which uniquely identifies the method
ILOffset – The line, in IL, that threw the exception. Our logger wrote this as decimal, but we can easily convert it to hex.

These values are just included for convenience:

classFull – the full, namespace-qualified, name of the class
methodName – the actual method’s name
methodSignature – the signature, to help troubleshoot overloaded methods

Given this xml snippet, we can have any tool or process consume it. In this case, I wrote a sample WinForm app (PdbHelper.Gui, from the Download) that takes a directory to the pdb2xml database, as well as the Xml Snippet, and performs the lookup. The logic is straightforward, perhaps the only catch is the IL Offset is not always exact; therefore if there is no exact match in the pdb2xml file, round down – i.e. find the previous entry node.

So, the developer could run this app, and it returns the file, line, and column.

While this is a manual GUI app, the logic could be automate for a console app, or other process.

Step 3: Have Your Application Log the IL Offset

The last step is to generate the Xml snippet. Given any Exception, you can use the System.Diagnostics.StackTrace object to determine the IL Offset, method, and module. You first need to create a new StackTrace object using the current Exception. You can then cycle through each StackFrame, getting the relevant data. This logic could be abstracted to its own assembly such that you could easily re-use it across all your applications.

    public static string CreateXmlLog(Exception ex)
    {
      try
      {
        //Get offset:
        System.Diagnostics.StackTrace st = new System.Diagnostics.StackTrace(ex, true);
        System.Diagnostics.StackFrame[] asf = st.GetFrames();

        StringBuilder sb = new StringBuilder();
        sb.Append("\r\n");

        int[] aint = new int[asf.Length];
        for (int i = 0; i < aint.Length; i++)
        {
          System.Diagnostics.StackFrame sf = asf[i];
          sb.Append(string.Format(" \r\n",
            sf.GetMethod().Name, sf.GetILOffset(), sf.GetMethod().Module, sf.GetMethod().ReflectedType.FullName,
            sf.GetMethod().ToString(), GetILHexLookup(sf.GetMethod().MetadataToken)));
        }

        sb.Append("\r\n");

        return sb.ToString();
      }
      catch (Exception ex2)
      {
        return "Error in creating ILExceptionData: " + ex2.ToString();
      }
    }
    private static string GetILHexLookup(int intILOffsetDec)
    {
      return "0x" + intILOffsetDec.ToString("X").ToLower();
    }

Download the source code and demo

You can download the complete source code, and an automated demo here.

The package has the following folders:

BuildScripts - automated scripts to run everything. This is useful if you wan to integrate the PdbHelper into your own processes.
mdbg - the pdb2xml application, with the compiled binaries. This was downloaded from MSDN.
PdbHelper.Cmd.Test - the command line tool to create all the xml files from pdb (this wraps the MSDN pdb2xml code)
PdbHelper.Core - reusable logic that the command line and GUI both use.
PdbHelper.Gui - the windows GUI to easily look up debugging info in the pdb-generated xml files.
PdbHelper.Logger - a reusable logger component that takes in an Exception and returns an xml snippet containing the IL offset.
TestLoggerApp - a test application to demonstrate all this.

There's not much code to all of this, so you could just reverse engineer it all. But to make it easy, go to the BuildScripts folder and you're see 4 bat files, numbered in order:

0_DeleteBins.bat - cleans up things to "reset" everything (delete bin and obj folders). This is optional, but useful when developing.
1_CompileFramework.bat - compile the PdbHelper framework (you could just open the solution in VS)
2_RunTestApp.bat - runs the test console app, whose whole purpose is to throw an exception, display the IL offset xml snippet, and then write it out to a file for easy use.
3_LookupException.bat - Run the windows GUI app. This passes in command line arguments to automatically populate the xml directory and the IL offset snippet generated in the previous step. You just need to click the "Lookup" button, and it should show you the debug info.

Several of these scripts call MSBuild to run certain tasks. Also, by default, this dumps the pdb2xml files in C:\Temp\pdb2xml.

Conclusion

Using these three steps allows an application to log additional info, from which we can then query the pdb files to find the source file and line number. This extra info can be very useful when debugging, helping to reduce the total cost of ownership.

Monday, July 28, 2008

Deploying PDBs to get the exact line number of an exception

[This was originally posted at http://timstall.dotnetdevelopersjournal.com/deploying_pdbs_to_get_the_exact_line_number_of_an_exception.htm]

(Also check out Getting file and line numbers without deploying the PDB files).

Ideally an application in production will log all its unhandled exceptions. It would also be ideal for those error logs to have as much information as possible to help reproduce the bug. One piece of info that is very handy is knowing what line number of source code ultimately triggered the exception. For example, if you have a 50-line method that throws a null exception, you'll want to know what exact line was the culprit. If you just have the raw assemblies (only the DLL or EXE), that alone doesn't tell you the source-code line number, and for good reason - you lose all that line-number info when compiling your friendly source code (complete with comments and white space) into an assembly. That compiled assembly is essentially just a collection of bytes; it is not plain text or human-friendly.

That's why the PDB ("program database") are so great. The PDB file maps the original source code to the compiled assembly. It is the PDB that helps you step through your actual code in the debugger. So, without your PDB file, the Exception's stack trace just shows method names. But with the PDB file, it also includes "debugging" information like file name and line number.

For example, say you have the following trivial program, whose sole point is to throw an exception.

using System;

namespace ExceptionDemo
{
  class Program
  {
    static void Main(string[] args)
    {
      try
      {
        Console.WriteLine("Started");
        DoStuff();
      }
      catch (Exception ex)
      {
        Console.WriteLine("Error: " + ex);
      }
      Console.WriteLine("Done");
    }

    public static void DoStuff()
    {
      throw new ApplicationException("test1");
    }
  }
}

If you compile this in release mode, it creates the exe and pdb. If you run this with the pdb existing, it will output:

Started
Error: System.ApplicationException: test1
at ExceptionDemo.Program.DoStuff() in C:\Temp\Program.cs:line 26
at ExceptionDemo.Program.Main(String[] args) in C:\Temp\Program.cs:line 15
Done

If you then delete the pdb, it outputs:

Started
Error: System.ApplicationException: test1
at ExceptionDemo.Program.DoStuff()
at ExceptionDemo.Program.Main(String[] args)
Done

Moral of the story - including your PDB files with the released code lets your exceptions automatically pick up the extra helpful info like line numbers and file names. Of course, there's always a catch - PDB files may expose your intellectual property, so you probably don't want to ship them. So, the ideal situation would be to keep your PDB files for the release builds, don't ship them, but have a way to look up the line and file info whenever the application logs an error. Can we do this? Yes, using several other techniques that we'll discuss tomorrow.

Note - including the PDB file is NOT the same as shipping the debug mode. Debug mode fundamentally can compile different code, like using the #IF DEBUG declaratives. Merely adding the PDB file does not suddenly turn the release build into the debug build.

Sunday, July 27, 2008

Code Reviews - objections and counter-objections

[This was originally posted at http://timstall.dotnetdevelopersjournal.com/code_reviews__objections_and_counterobjections.htm]

It's a good thing to have another person review the production code that a developer writes. Two heads are better than one. Code Reviews offer many benefits, especially catching bugs when they're cheaper to fix and sharing knowledge across the team. However, some people still have a lot of resistance to code reviews. Here are some common objections to code reviews, and problems with those objections.

Reason to not do a code review	Problem with that reason
It's my code, I don't want anyone messing with my code.	Technically, it's not your code - it's your company's code. They're paying for it.
I don't have time.	Code reviews aren't about wasting time discussing pointless trivia, they're about saving time by double-checking the code upfront, where bugs are much cheaper to fix than once those bugs have propagated all the way to production.
My code isn't ready yet to be reviewed.	Some devs want to first write a 2-month feature, have it work perfectly, and then essentially have a quick code-review meeting that rubber-stamps their amazing feature. But what good is a 15 minute review after two months of work? How often should you do code reviews? It should be frequent enough such that there's still time to act on it.
I'm a senior dev, I don't need to have some junior dev telling me what to do.	There are good reasons for a junior dev to review a senior dev's code, such as helping that junior dev to learn, which in turn benefits the whole team.
My code will already work (I tested it myself) - it doesn't need a code review.	This just isn't probable. We humans are fallible creatures, and even the best of us makes mistakes. Even if a developer's code is functionally perfect, maybe it can still be improved by refactoring, or using better coding tips or team-build components. And if the code is truly perfect in a way that it cannot be improved, it would be great for other developers to review it such that they learn from it.
My code is too complicated to explain in a code review.	If the code is truly too complicated, that's exactly why it should be reviewed - such that other team members can see how to make it simpler, or at least start understanding it so that other people are prepared to maintain it when you cannot.

Thursday, July 24, 2008

Bugs - kill them when it's cheapest

[This was originally posted at http://timstall.dotnetdevelopersjournal.com/bugs__kill_them_when_its_cheapest.htm]

The sooner in the development life cycle that you catch a bug, the cheaper it is to fix. A simple design flaw consumes hours of coding time, more hours of testing time, potentially gets passed into training and written into documentation, has other components built on top of it, and gets deployed into production. Sometimes the original developer is gone, leaving the team precious little knowledge of how to change the erroneous code. I've seen many projects where a production error, something as simple as an ill-placed null reference exception, pulls the entire team into fixing it. Usually there are people shaking their head, "if we had just spent 60 seconds to write that null-check when we first developed the code." It's sad, but true.

As time increases, bugs become more expensive to fix because:

The bug propagates itself - it gets copied around, or other code gets built on top of it.
The code becomes less flexible - the project hits a code-freeze date, or the code gets deployed (where it's much harder to change than when you're first developing it)
People lose knowledge of the erroneous code - they forget about that "legacy" code, or even leave the team.

This is why many of the popular, industry best practices, are weighted to help catch bugs early - code reviews up front, allow time for proper design, unit tests, code generation, and setting up process the right way, etc... But, a lot of development shops, perhaps because they're so eager to get code out now, often punt ("we'll fix it later"), and end up fixing the bugs when they're most expensive. That may be what's necessary when a project first starts ("There won't be a tomorrow if we don't get this code out now"), but eventually it's got to shift gears and kill the bugs when they're the cheapest.

Tuesday, July 22, 2008

Screen scraping the easy way with .Net

[This was originally posted at http://timstall.dotnetdevelopersjournal.com/screen_scraping_the_easy_way_with_net.htm]

Sometimes you may want to collect mass amounts of data from many web pages, and the easiest way is to just screen-scrape it. For example, perhaps a site doesn't provide any other data export mechanism, or it only lets you look up one item at a time, but you really want to look up 1000 items. That's where you have an application request the html page, then parse through the response to get the data you want. This is becoming rarer and rarer as RSS feeds and data exporting becomes more popular. However, when you need to screen scrape, you really need to screen scrape. .Net makes it very easy:

WebClient w = new WebClient();
string strHtml = w.DownloadString(strUrl);

Using the WebClient class (in the System.Net namespace), you can simply call the DownloadString method, pass in the url, and it returns a string of html. From there, you can parse through with Regular Expressions, or perhaps an open-source html parser. It's almost too easy. Note that you don't need to call this from an ASP.Net web app - you could call it from any .Net app (console, service, windows forms, etc...). Scott Mitchell wrote a very good article about screen-scraping back in .Net 1.0, but I think new features since then have made it easier.

You could also use this for a crude form of web functional testing (if you didn't use MVC, and you didn't have VS Testers edition with MSTest function tests), or to write other web analysis tools (is the rendered html valid, are there any broken links, etc...)

Thursday, July 17, 2008

.Net is like the galaxy, they're both big and getting bigger

[This was originally posted at http://timstall.dotnetdevelopersjournal.com/net_is_like_the_galaxy_theyre_both_big_and_getting_bigger.htm]

Like the galaxy, .Net is big, and its only getting bigger. It stretches as far as the eye can see, or better yet, as far as the mind can think We're now in the 5th release of .Net (1.0, 1.1, 2.0, 3.0, 3.5), each one adding more to the previous. This includes not just a bigger API, but fundamentally new technologies and techniques - Ajax, WPF (with Xaml), Silverlight, WCF, WWF, etc... The .Net ecosystem is growing too - with open source, guidance, blogs, and vendors. It is expanding across all aspects of development (including games, mobile devices, enterprise apps, rich media, hobbyists apps, etc...).

I see at least three practical consequences of this:

It's too big for one person to "know it all". This is why prescriptive guidance and community consensus are so important. It also gives hope to younger developers - I've gotten to work with several younger new-hires, who initially think that they'll never make some innovative contribution to the team. I explain to them that because .Net is so big, as long as they keep trying, it's inevitable, they'll eventually come to a new frontier that no-one else on the team has seen - a new tool, a new trick, they'll be the first to pick up a new technology.
How does someone keep up? There are plenty of ways to learn about .Net. However, the vastness of it all does force a normal person to pick a niche. It helps to pick, or work towards, a niche that you enjoy. By making learning a lifestyle, a developer can continually pick up new things. It also helps that .Net is growing in a good direction...
It's growing in a good direction. It's not that .Net is expanding into chaos, but rather it's growing more and more powerful. Part of this is retiring older technologies, either by making them obsolete (who uses COM), or wrapping them with an more convenient technique (a Domain Specific Language, an easier tool or API). The new enhancements aren't making us developers dumber, but rather freeing us up to focus on more interesting problems.

I see these as good things. Software engineering's continual expansion is one of the things that so fascinates me with the field.

Wednesday, July 16, 2008

Book: Beyond Bullet Points

[This was originally posted at http://timstall.dotnetdevelopersjournal.com/book_beyond_bullet_points.htm]

The corporate world is filled with endless PowerPoint presentations. Many of these are just templated slide after slide of bullet points, which can be boring. A recent book I read, Beyond Bullet Points, by Cliff Atkinson, explained an alternative technique to make PowerPoint more interesting. His idea (as best I understand it) is to mimic what other successful media do (like Hollywood) by telling a story with pictures instead of using bullet points. The end result is that it emphasizes the speaker's own words rather than endless PowerPoint text. I had the opportunity to attend the USNAF back in 2006, and several presentations used this technique, and it was indeed more lively.

Tuesday, July 15, 2008

The difference between projects, namespaces, assemblies, and physical source code files.

[This was originally posted at http://timstall.dotnetdevelopersjournal.com/the_difference_between_projects_namespaces_assemblies_and.htm]

When creating simple applications, the project, namespace, assembly, and physical source code file usually are all related. For example, if you create a project "ClassLibrary1", it compiles to an assembly "ClassLibrary1.dll", creates a default class in namespace "ClassLibrary1", creates a folder "ClassLibrary1", and places the source code within that folder. Everything is consistent and simple.

However, simple is not always enough. These four things can all be independent.

Project - The visual studio project that contains all the source code (and other build types like embedded resources), which gets compiled into the assembly. A project can reference any file - including files outside of its folder structure. By opening the project in notepad, you can manually edit the include path to be an external reference: . The file icon will now look like a shortcut.
Assembly - The physical dll that your code gets compiled to. One assembly can have many namespaces.
Namespace - The namespace is used to organize your classes. You can change the namespaces to anything you want using the namespace keyword. It does not need to match the assembly or folder structure.
Source Code - This does not need to be stored in the same directory as the project. So, you could have several projects all reference the same source code file. For example, you may have one master AssemblyInfo file that stores the main version, and then all your projects reference that file.

So, if you have an aspx page referencing "ClassLibrary1.Class1.DoStuff()", it doesn't care if that class is in Assembly "ClassLibrary1.dll" or "ClassLibrary1Other.dll", as long as it has access to both assemblies and the namespace is the same.

This can be useful for deployment, or sharing global files across multiple projects, or just neat-to-know trivia.

Sunday, July 13, 2008

Ideas to encourage your boss to invest in Silverlight

[This was originally posted at http://timstall.dotnetdevelopersjournal.com/ideas_to_encourage_your_boss_to_invest_in_silverlight.htm]

Silverlight has a lot of benefits, but as a new technology, it also has problems. As a new technology, it is inevitably riskier as many of the kinks haven't been worked out yet. Managers, who want to avoid unnecessary risk, may shy away from such a technology. However, there are ways to encourage a manager to at least consider Silverlight:

Show an actual demo of what Silverlight can do (such as on the gallery). Talk is cheap, but seeing Silverlight in action is powerful.
Where feasible, consider developing simple internal tools with Silverlight. Managers almost expect devs to always insist on using the latest technology, regardless of it's business value. But if you believe enough in the tech to invest your own time learning it and applying it to a simple business problem that your department faces - that carries a lot of weight.
Emphasize the aspects of Silverlight that would benefit your team - perhaps a rich UI with animating charts, or drag and drop, or rich media, or C# on the client, or cross-browser, etc...
If all else fails, consider a little fear-mongering: "Our competitors will be using this". If not Silverlight, at least a Silverlight-competitor like flash.

Some managers were hesitant when JS came out ("it's got cross-browser problems", "not all client support it"), when .Net came out ("J2EE is the established enterprise platform"), when Ajax came out ("it will have security holes"), etc... There's understandably going to be some skepticism with Silverlight too, but that's ok. I personally believe that Silverlight can deliver, and therefore instead of trying to encourage managers to adopt it, managers will be recruiting developers who know it.

Wednesday, July 9, 2008

Persisting data in a web app by using frames

[This was originally posted at http://timstall.dotnetdevelopersjournal.com/persisting_data_in_a_web_app_by_using_frames.htm]

A basic problem with developing web applications is that their foundation technology, html, is stateless. That means that you constantly need to jump through hoops in order to pass data from page1 to page2. Of course there are ways to solve this, such as using ASP.Net session state, querystrings, cookies, or persisting to a database. There is another way that may work for simple data if your app is hosted in a frame.

Say you have your main page, which is just a frameset. All the navigation occurs within that frameset, such that going from page1 to page2 merely updates the frame's url, it doesn't re-create the host page. This leaves the host page intact, including it's JavaScript state. Therefore, you could have a JavaScript variable persist data between pages.

<html>
<head>
    <title>My Apptitle>
    <script language="javascript" type="text/javascript">
      var _javaScriptVar = null;
    script>
head>
<frameset>
      <frame src="Page1.aspx" id="mainFrame">
frameset>
html>

You could then reference this variable from your child pages via the DOM:

window.parent._javaScriptVar = "someValue";

This means that page1 could set the value, and page2 could retrieve that value. To the end user, it looks like data has been persisted across pages. You could also expand this using JavaScript hashtables to store name-value pairs of data, and then add wrapper methods for an easy API. This is a surprisingly simple approach, and it has pros and cons:

Pro:

Very easy to implement for new apps
Scalable - as it stores data on the client, instead of on the server (like session state)
Can store strongly-typed data. This saves to a JavaScript variable, which can store complex data as opposed to just strings (although you could just use JSON to serialize most complex objects to a string and back)
It avoids cookies, which have their own limits and problems.

Con:

It messes up your URLs, as the user only sees the URL for the host page, not the child pages. (But this may be a good thing)
It is absolutely not secure, as any hacker could modify the JavaScript variables.
It does not persist across sessions - it's only good for convenience data on the UI.

Overall, it's a cute trick for certain apps. Although, I'd rather use Silverlight if I could.

Monday, July 7, 2008

Two limits with Silverlight (Beta 2)

[This was originally posted at http://timstall.dotnetdevelopersjournal.com/two_limits_with_silverlight_beta_2.htm]

Silverlight has several fundamental benefits. However, there's always a flip-side, and it has some shortcomings too. There are at least two major limits that I see:

Silverlight requires a separate plug-in. Although Flash also requires a plug-in, Flash has something like 98% market share, and is essentially as available as JavaScript. For Silverlight though, this separate plug-in will make many business sponsors take a second look. Of course, MS knows this and is actively working on it - they'll use the full dominance of MS sites (hotmail, msn, etc...) to prompt your to download Silverlight, they'll make it an automatic update so system admins can easily install it across the enterprise, they'll include it in future products, they'll convince popular sites to use it, and hence encourage all those extra viewers to download it. This separate plug-in is a limit, but not a show-stopper, especially for private or intranet apps.
Silverlight is still a very young technology. After the JS release, and a 2.0 alpha, beta1, and beta2, it still doesn't even have a combo box! However, I'd expect that the Microsoft eco-system will rush to fill in these gaps via open source and the Silverlight community. Silverlight is young, but I'd expect the Microsoft faithful and developer community will make it grow fast.

As a developer, I realize that Silverlight has its problems, and an uphill climb, but I'm optimistic. I think that soon its strengths will outweigh its weaknesses.

Thursday, July 3, 2008

What is the best way to learn coding?

[This was originally posted at http://timstall.dotnetdevelopersjournal.com/what_is_the_best_way_to_learn_coding.htm]

I have the opportunity to do a good amount of interviewing, and it lets me see many ways that developers promote themselves:

Years of experience
Work at a prestigious company
Professional awards
Certifications
Academic degrees (bachelors, masters)
Attended training classes
Buzzwords
Job title ("Extra Super Senior Technical Specialist Level 3++, with Honors")
"I've read every tutorial on w3schools"
Various activities on their last job

At the end of the day, these are all good, but people who have these still fail simple coding questions. They can talk a good talk, pass multiple choice tests, but struggle when trying to write 10 lines of C# on a whiteboard. Perhaps the #1 indicator of a good developer is that they build their own personal coding projects - from scratch. Not just configure some buzzword package, but actually write, compile, deploy, maintain, and improve their own personal pet project. For example, many of the best developers I know are those who got started by writing their own computer games. I think this actually makes sense for a lot of reasons.

Good practice - If you're asked to write code in an interview, what better way to practice than by writing code on your own?
Emotional attachment - You have a vested interest in your own pet projects, and a vested interest in their success. Therefore, you'll inevitably be more eager to learn and understand the coding techniques involved, as opposed to some "boring project for work."
Small and flexible - A pet project is small, so it's flexible and easy to change - you're not constantly dragging around years of legacy code.
Easier to try new things - You're more likely to try new things for your own project, than risk screwing up the company's flagship product.
You see the big picture - You see your own project end-to-end, in its full context, as opposed to just a small niche of a much larger product.
Lets you focus - A small project, of your own interest, lets you focus on just the specific tech you want, as opposed to writing thousands of lines of redundant or plumbing code for work. It's often a minimalist example of a some interesting technology, because once it gets bloated, it stops being interesting, and the developer stops working on that hobby project, and moves onto another one.
Shows motivation - Someone who invests the energy to write their own application, off the clock, is probably motivated to learn new technology and tricks for their work project.

There are tons of fun, pet projects you could build. And with free open-source hosting with places like CodePlex, you can easily share that code with others. If anyone has experienced a better way to learn coding, I'm all ears.

Wednesday, July 2, 2008

Why you need to be able to write code on a whiteboard

[This was originally posted at http://timstall.dotnetdevelopersjournal.com/why_you_need_to_be_able_to_write_code_on_a_whiteboard.htm]

During a software engineering interview, you need to be able to write code on a whiteboard. During too many interviews, at multiple companies, I've seen candidate after candidate struggle to write simple code on the whiteboard. Many of these candidates are decent people, with CS degrees, Master degrees , "years or experience", "senior developer" titles, etc... Yet they cannot write 10 lines of correct code on a whiteboard.

For example, such candidates will struggle to answer "given a array of integers, write a method that returns the sum of all the positive numbers." Seriously. This is just a few lines of trivial code - no advanced API, no design patterns, no trivia, no tricks. It's what you'd see in a college CS101, first semester exam:

public static int GetPositiveSum(int[] aInt)
{
  if (aInt == null)
    return 0;

  int intSum = 0;
  foreach (int i in aInt)
  {
    if (i > 0)
      intSum += i;
  }

  return intSum;
}

I've seen experienced, honest, candidates continually miss code of this simplicity. They'll have logic, syntax, or style problems:

Logic - They'll add the numbers wrong, such as overwriting sum (intSum = i), or adding it exponentially (i += i), or completely ignore any validation (what if an input parameter of a public method is null?) It's one thing not to get bogged down in tedious plumbing, but a candidate should be prepared to call out if something could crash their code.
Syntax - A significant chunk of developers just dismiss syntax, emphasizing that they know the concepts instead. Good recruiters don't care about trivia, like if you miss a semi-colon at the end. But I've seen people have the wrong method signature, declare a variable wrong, have the wrong for-loop syntax, or reference an array wrong. A mistype is okay, but when you try to drill down on concepts ("Does the variable intSum need to first be assigned before being used?", "Will your loop execute the correct number of times?"), and they shrug it off, that's not good.
Style - Style is subjective, but important none-the-less. A good recruiter doesn't care if you indent with two spaces or four spaces. But there are other "style choices" that really do matter. I've seen developers declare their variable out of scope, such as using an extra static member instead of keeping it encapsulated as an instance field. I've also seen many devs just dismiss validation by wrapping the method in a try-catch. Or they'll make a method instance when it could be static instead. It's okay to have personal coding preferences, but these kind of "style" things actually affect the functionality of the code. A candidate should be prepared to explain why these made certain "style" choices.

Obviously, recruiters know that the candidate can write code, and could stumble through 10 lines of C#. But the idea is that if a candidate struggles to write trivial code, without the aid of Visual Studio's intellisense, compiler, and debugger, then they don't really get it. In other words, if a candidate uses VS as a crutch for simple code, then they're probably just "coding by coincidence" as opposed to proactively thinking through the code. And while you can "code by coincidence" for easy problems, on small, standard, applications, it will result in endless bugs on larger, complex apps. For example, if a candidate doesn't realize that they need to check for a null input parameter on simple code (even when prompted), how can they be expected to validate complex and critical code?

Some interviewees seem to dismiss these coding questions as "beneath them". The problem is that you must judge the unknown by the known. If a recruiter observes the candidate mess up simple code (that the recruiter can see), they'll be less likely impressed by fancy-sounding projects that they [the recruiter] cannot see. In other words, if for whatever reason the candidate cannot conceptually work through simple code, most recruiters won't even pay attention to all that allegedly complex code that the developer wrote elsewhere.

As a candidate, this coding on the whiteboard is the chance to shine. This is not the time to say things like "I guess this is how you do it", or "I'm trying to remember back in my college days". Rather, this is where the candidate can show that they know C# so well such that they can write it straight - without any crutch - and then explain it, and then adapt it on the fly. Now that's a great way to get off to a good start in an interview.

FYI, here are some of the unit tests that I'd run the above code through:

Assert.AreEqual(0, GetPositiveSum(null));

Assert.AreEqual(0, GetPositiveSum(new int[] { 0 }));
Assert.AreEqual(0, GetPositiveSum(new int[] { -1, -4, -4 }));
Assert.AreEqual(0, GetPositiveSum(new int[] { 0, -4 }));
Assert.AreEqual(0, GetPositiveSum(new int[] { Int32.MinValue }));

Assert.AreEqual(4, GetPositiveSum(new int[] { 4 }));
Assert.AreEqual(10, GetPositiveSum(new int[] { 1,2,3,4 }));
Assert.AreEqual(4, GetPositiveSum(new int[] { -4, 4 }));
Assert.AreEqual(Int32.MaxValue, GetPositiveSum(new int[] { Int32.MaxValue }));
Assert.AreEqual(3, GetPositiveSum(new int[] { 0, 1, 2, -3 }));

Tuesday, July 1, 2008

Why you shouldn't just wrap simple code in a try-catch

[This was originally posted at http://timstall.dotnetdevelopersjournal.com/why_you_shouldnt_just_wrap_simple_code_in_a_trycatch.htm]

Structured programming languages, like C#, provide a try-catch ability to wrap an entire block of code, and catch if anything goes wrong. This can massively simplify error checking, especially when calling external components. However, like any good thing, it can also be abused. During interviews, or rushing out code, many developers resort to using try-catch as a quick way of doing error checking, but there's a catch (excuse the pun). For example, consider this trivial method to sum up an array of integers:

    public static int GetSum(int[] aInt)
    {
      int intSum = 0;
      foreach (int i in aInt)
      {
          intSum += i;
      }
      return intSum;
    }

You can crash this code, such as by passing in null for the int array. To blindly wrap a simple method with try catch, just to catch normal logic errors like null variables or indexOutOfRange, has problems.

For performance, Try-catch is enormously expensive compared to an if-then check. It could be hundreds of times slower. So it's bad design to use a try-catch for something trivial when an in-then will do just file, like checking if a variable is null.
What will you do in the catch statement? Some devs may say "I'll log my exception info here" - but what info is that... that the developer didn't bother to do basic error checking?
It makes the usability worse. Perhaps instead of re-throwing an exception, or logging and going to the global error page, the method could have used an if-then to return a sentinel value instead. For example, perhaps a string manipulation method could just return the original value (or null) if the input parameters were invalid.
It loses context. Say your code catches a null exception - that thrown exception doesn't tell you what variable was null. You've lost context and information with which to fix the code. Using an in-then would have given you full access to all the local variables and their context.
It makes all exceptions equal - a null or indexOutOfRange exception (things that are easy to catch with if-thens) are put on the same level as a critical outOfMemory error - i.e. both trigger the catch block. This in turn can distract the maintenance team, who gets swamped with tons of exceptions, most of them preventable.
Perhaps worst of all, it implies that the developer isn't thinking about how their method can fail. Rather than understanding the code and what it does, they just wrap it all in a try-catch, and move on. This means they could have not coded for valid use cases.

Try-catch has obvious benefits, but it has limitations and consequences too. There are times when it's much more beneficial to explicitly catch the error conditions with an if-then instead.