If you follow my blog, forum posts or Twitter updates you may have already realized that I care a great deal about the performance of my work. Therefore, I am very keen on doing code profiling once in a while, and I also have clients who want high performance and capacity with less hardware investment.
In this blog post I give away a few insights based on some of the common findings I make when I do code profiling of Episerver sites. If you are aiming for a high-performing website by means of fast code instead of adding more servers, keep reading.
These are the eight issues:
- Too many nested blocks
- Blocks that are used just for referencing images or page links
- Loading a list of content references sequentially
- Resolving the same URL again and again
- Over-using LINQ and (re-)allocating too many collections
- Too many allocations of arrays and lists
- Too many null-checks on the same property
- Overly complex web of dependency injections
1. Too many nested blocks
I know that it is very common practice to use blocks for structuring content and data. An example could be blocks that contain a list of blocks that in turn contain a list of blocks.
DO NOT do that!
Issues:
- Nested blocks may be nice to work with for developers. But for editors they are a pain to maintain.
- Rendering nested content areas does not perform well, because blocks are loaded from the cache or the database sequentially (one at a time).
- Also, when loading the blocks manually with IContentLoader, it is easy to forget to use the batch GetItems method. And even when we do use it, it has to be called once for each nesting level.
Solutions:
- Try to flatten the structure of blocks, even if it means that a few property values will be duplicated.
- Alternatively, use PropertyList<T> for the leaf level content.
2. Blocks that are used just for referencing images or page links
Another thing I have often experienced is blocks used as a container for a link to content along with some text.
For instance, when adding an image to a page or a block, some developers will make a separate block type with a content reference and an alt text for that image. It could also be a menu built from menu item blocks, instead of a simple LinkItemCollection.
Issues:
- Like before, this makes for a bad editor experience.
- Having to load a block and then the actual linked content item is bad for performance.
Solutions:
- Add the linked content item, image or page, directly to the content area.
- Alt texts etc. should be on the image content type, anyway.
- In general, use only as many blocks as necessary. Not more than that.
3. Loading a list of content references sequentially
When having to load a list of content references, it is easy to loop through that list, call IContentLoader.Get<T> and build a list of T to return.
Issues:
- Telling Episerver to get one content item at a time is wasteful in terms of repeated method calls, and database round-trips (when items are not already in the cache).
Solutions:
- If you already have a list of references, just use the IContentLoader.GetItems method. Episerver will then load as many items as possible from the cache, and batch-load the rest from the database.
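The difference can be sketched without Episerver at all. The example below uses a hypothetical in-memory stand-in (`FakeLoader`, with integer ids instead of content references) that simply counts round-trips. It is an illustration of the pattern under those assumptions, not the real `IContentLoader` API.

```csharp
using System.Collections.Generic;

// Hypothetical stand-in for a content store; not part of Episerver.
public sealed class FakeLoader
{
    private readonly Dictionary<int, string> _store;

    // Counts how many times the store was asked for data.
    public int RoundTrips { get; private set; }

    public FakeLoader(Dictionary<int, string> store)
    {
        _store = store;
    }

    // One call and one round-trip per item.
    public string Get(int id)
    {
        RoundTrips++;
        return _store[id];
    }

    // One call and one round-trip for the whole batch.
    public IReadOnlyList<string> GetItems(IReadOnlyCollection<int> ids)
    {
        RoundTrips++;
        var result = new List<string>(ids.Count);
        foreach (var id in ids)
        {
            result.Add(_store[id]);
        }
        return result;
    }
}

public static class LoadingDemo
{
    // The pattern to avoid: ask for the items one by one in a loop.
    public static List<string> LoadSequentially(FakeLoader loader, IReadOnlyCollection<int> ids)
    {
        var result = new List<string>(ids.Count);
        foreach (var id in ids)
        {
            result.Add(loader.Get(id));
        }
        return result;
    }

    // The preferred pattern: hand over the whole list of ids at once.
    public static IReadOnlyList<string> LoadBatched(FakeLoader loader, IReadOnlyCollection<int> ids)
    {
        return loader.GetItems(ids);
    }
}
```

Both methods return the same items in the same order. Only the number of round-trips differs, and that difference grows with the length of the list.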
4. Resolving the same URL again and again
Resolving URLs is a must in every Episerver solution. But doing it many times and doing it over and over again can be quite slow.
Issues:
- Instances of CMS content are automatically cached, but the URLs to those instances are not. This makes for some slowness in cases with many links and images, especially under high load.
Solutions:
- Consider caching the resolved URLs. But keep in mind that they need to depend on the content cache key, so that the cached URL is updated or removed with the content.
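As a rough sketch of the idea, the class below caches resolved URLs per content id and lets the caller evict an entry when the content changes. Everything in it is hypothetical (`ResolvedUrlCache`, the integer ids, the resolver delegate). In a real Episerver solution the eviction would rather be tied to the content cache key, as described above, than triggered by hand.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical helper; the names are illustrative and not Episerver APIs.
public sealed class ResolvedUrlCache
{
    private readonly ConcurrentDictionary<int, string> _urls =
        new ConcurrentDictionary<int, string>();

    // The (slow) URL resolution, supplied by the caller.
    private readonly Func<int, string> _resolve;

    public ResolvedUrlCache(Func<int, string> resolve)
    {
        _resolve = resolve ?? throw new ArgumentNullException(nameof(resolve));
    }

    // Resolves the URL on first use and serves the cached value afterwards.
    // Under contention the resolver may run more than once for the same id,
    // but only one result is kept.
    public string GetUrl(int contentId)
    {
        return _urls.GetOrAdd(contentId, _resolve);
    }

    // Assumption: something calls this when the content item is changed,
    // moved or deleted, so that a stale URL is never served.
    public void Invalidate(int contentId)
    {
        _urls.TryRemove(contentId, out _);
    }
}
```

The point of the design is that the cached URL lives and dies with the content item it belongs to. A cache without any invalidation would be faster still, but it would keep serving old URLs after a page is renamed or moved.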
5. Over-using LINQ and (re-)allocating too many collections
LINQ can make for really readable code. That is great! But LINQ can also be really inefficient if used wrongly.
Issues:
- When calling ToList() at the end of a LINQ method chain, the final number of items is typically not known up front, so the list starts with the default capacity (which is 4 once the first item is added). If there are fewer items, space is wasted. If there are more, the list’s inner array is replaced by a larger one and the items are copied over, possibly several times. This also means wasted space.
- Calling ToList() on a chain, then filtering further and finally calling it again is also bad as it spends unneeded time and memory.
- It also seems common to put an Any() call inside a Where() filter. But this is bad, as the inner lookup is performed for each item in the outer list. And if the inner collection is itself a LINQ query, or if it is not a hash-based collection such as a dictionary, it may have to go through each and every item in that collection each time. Not good!
Solutions:
- Only call ToArray() or ToList() after the final filter, ordering or similar operator in a LINQ method chain.
- When filtering items that match something in another list, really consider using a Join() expression instead of calling Any() in a Where() filter.
- If possible, try to make the inner collection a dictionary before joining or looking up items in it. That may speed things up a lot. But keep in mind that it has a slight cost to initialize.
- In case the input collection is anything that has a Count or Length property, use that as the initial capacity on the materialization to an array or a list. This is not out-of-the-box, so have a look at my extension methods.
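To make these points concrete, here is a small self-contained sketch. The `Product` type and all method names are made up for the example, and `ToPresizedList` is only my illustration of the idea for this example, not the extension methods mentioned above. A hash set is used for the lookup here, since only the keys are needed; a dictionary works the same way when values are needed too.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical type used only for this example.
public sealed class Product
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class FilterDemo
{
    // Slow: for every product, Any() scans the id sequence from the start.
    public static List<Product> FilterWithAny(IEnumerable<Product> products, IEnumerable<int> wantedIds)
    {
        return products.Where(p => wantedIds.Any(id => id == p.Id)).ToList();
    }

    // Faster: build a hash set once, then each lookup is a single probe.
    public static List<Product> FilterWithHashSet(IEnumerable<Product> products, IEnumerable<int> wantedIds)
    {
        var wanted = new HashSet<int>(wantedIds);
        return products.Where(p => wanted.Contains(p.Id)).ToList();
    }

    // Join() builds a lookup of the inner sequence internally.
    // Distinct() keeps a repeated id from duplicating a product in the result.
    public static List<Product> FilterWithJoin(IEnumerable<Product> products, IEnumerable<int> wantedIds)
    {
        return products
            .Join(wantedIds.Distinct(), p => p.Id, id => id, (p, id) => p)
            .ToList();
    }

    // Materializes into a list whose capacity equals the known item count,
    // so the inner array is allocated exactly once.
    public static List<TResult> ToPresizedList<TSource, TResult>(
        IReadOnlyCollection<TSource> source, Func<TSource, TResult> selector)
    {
        var result = new List<TResult>(source.Count);
        foreach (var item in source)
        {
            result.Add(selector(item));
        }
        return result;
    }
}
```

All three filter methods return the same products in the same order. The pre-sized variant only works when the input exposes its count, which is why it takes an `IReadOnlyCollection<T>` rather than an `IEnumerable<T>`.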
6. Too many allocations of arrays and lists
We always end up working with lists of data of some sort. But quite often I see eagerly loaded lists of enriched data.
In a view model factory, for instance, we may take a list, do something with it and return a new list. A very common scenario. Except it is allocating an often-unneeded array of data that could have been lazily generated.
Issues:
- When creating a list and adding view model objects one by one, the list is actually reallocating several times. That is, unless it fits in the default capacity, or you specify the correct capacity.
- When creating a list in the view model factory and handing it to maybe a Razor view or JSON serializer, the list is not needed. Those are capable of efficiently enumerating and rendering items one-by-one.
Solutions:
- Let the view model factories return IEnumerable<T> and use yield statements inside a loop instead of returning a list. Instead of yielding view model objects, you can also return the LINQ method chain directly (with no list materialization).
- If a list is needed, try to create one with an exact initial capacity. One that matches the actual number of items.
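A minimal sketch of the three variants could look like this. The `Article` and `ArticleViewModel` types and the factory itself are hypothetical and only serve the example.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical types used only for this example.
public sealed class Article
{
    public string Title { get; set; }
}

public sealed class ArticleViewModel
{
    public string Heading { get; set; }
}

public static class ArticleViewModelFactory
{
    // Lazy: each view model is created only when the consumer asks for it.
    public static IEnumerable<ArticleViewModel> CreateLazy(IEnumerable<Article> articles)
    {
        foreach (var article in articles)
        {
            yield return new ArticleViewModel { Heading = article.Title.ToUpperInvariant() };
        }
    }

    // Also lazy: return the LINQ chain directly, with no ToList() at the end.
    public static IEnumerable<ArticleViewModel> CreateWithLinq(IEnumerable<Article> articles)
    {
        return articles.Select(a => new ArticleViewModel { Heading = a.Title.ToUpperInvariant() });
    }

    // Eager, for when a list really is needed: allocate the exact capacity once.
    public static List<ArticleViewModel> CreateList(IReadOnlyCollection<Article> articles)
    {
        var result = new List<ArticleViewModel>(articles.Count);
        foreach (var article in articles)
        {
            result.Add(new ArticleViewModel { Heading = article.Title.ToUpperInvariant() });
        }
        return result;
    }
}
```

One thing to keep in mind with the lazy variants: the work is repeated every time the sequence is enumerated. They fit best when the consumer, such as a view or a serializer, walks through the items exactly once.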
7. Too many null-checks on the same property
This one is simple and may not gain much performance, but it helps keep the code clean. Other than that, this is pure micro-optimization.
Issues:
- The compiler may emit repeated null check IL instructions to the assembly. The cost of executing each of those is not high, but still it is unnecessary.
Solutions:
- Use a good static code analyzer (I prefer JetBrains ReSharper), and you will see where to put null checks. It will also show you where null checks are not needed, because a null check was already done in the code path.
- If you are certain that a field should never be null, you can put the null check in a Debug.Assert statement instead. When building in Release mode, where the DEBUG symbol is not defined by default, those calls will automatically be left out.
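As a small illustration, the sketch below checks for null once at the public entry point and only asserts in the private helper. The `Teaser` type and the renderer are hypothetical names made up for the example.

```csharp
using System;
using System.Diagnostics;

// Hypothetical type used only for this example.
public sealed class Teaser
{
    public string Heading { get; set; }
}

public static class TeaserRenderer
{
    // Public entry point: validate the argument once, at the boundary.
    public static string Render(Teaser teaser)
    {
        if (teaser == null)
        {
            throw new ArgumentNullException(nameof(teaser));
        }
        return FormatHeading(teaser);
    }

    // Private helper: the caller has already checked for null, so we only assert.
    // Debug.Assert is marked [Conditional("DEBUG")], which means the call is
    // dropped by the compiler when the DEBUG symbol is not defined.
    private static string FormatHeading(Teaser teaser)
    {
        Debug.Assert(teaser != null, "teaser is validated by Render");
        return (teaser.Heading ?? string.Empty).Trim();
    }
}
```

The real check stays where invalid input can actually enter, and the assertion documents the assumption for everyone reading the helper without costing anything in a Release build.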
8. Overly complex web of dependency injections
Dependency injection is a must for large projects, and it is the recommended (maybe even the only) way of working with Episerver solutions. But we need to keep the dependency graph simple.
I have seen bad examples, where controllers would inject many types, the service layers would inject many types, including some that were already injected into controllers. Some of the services would inject other services that also injected some of the same types.
Issues:
- In a case like the one just mentioned, the StructureMap dependency resolver will be a little slower and it will make unneeded memory allocations. This is because it has to go through all the related classes to build a graph of classes. Then it has to construct or look up the required implementations from the leaves up to the root.
Solutions:
- Try to keep layers of code simple and separate. Ideally controllers should call a few services and maybe a few view model factories.
- Think about the life cycle of services in the injection container. If initializing a class is hard or takes time, consider making it a singleton. If it has no external references at all, consider making it a transient service. Then there are the cases in between, where you have to make a qualified decision.