Common C# LINQ Mistakes and How to Avoid Them
How to write LINQ queries like a Pro
Introduction
C# LINQ (Language Integrated Query) is a powerful tool that allows developers to query and manipulate data concisely and efficiently. However, there are some common mistakes that developers make when working with LINQ. In this article, we will discuss three main points where developers tend to stumble and provide code examples to demonstrate how to avoid these pitfalls.
Feel free to check out my article on how LINQ works under the hood, where I cover how to use LINQ, what lazy evaluation is, and how you can implement your own LINQ-like methods.
The examples in this article use the following code (shared here for convenience):
using System;
using System.Collections.Generic;
using System.Linq;

public class Car
{
    public string Vin { get; set; } = string.Empty;
    public string Make { get; set; } = string.Empty;
    public string Model { get; set; } = string.Empty;
    public int Price { get; set; }
    public int HorsePower { get; set; }
    public int Weight { get; set; }
    public Energy Energy { get; set; }
    public int Height { get; set; }
    public int Width { get; set; }
    public int Length { get; set; }
    public Transmission Transmission { get; set; }
    public Traction Traction { get; set; }
    public DateTime PurchaseDate { get; set; }
    public int Cost { get; set; }
}

public enum Energy
{
    Diesel,
    Petrol,
    Electric,
    Hybrid
}

public enum Transmission
{
    Manual,
    Automatic,
    Cvt,
    Dct
}

public enum Traction
{
    Fwd,
    Rwd,
    Awd
}

public class CarDealership
{
    private IList<Car> cars = new List<Car>();
}
The example above is a class library that will be used to manage a Car Dealership (specifically the car lot). We will add features to the CarDealership class in the code samples we will discuss below.
Inadequate Naming
Single Letter Parameter Names
One of the common mistakes developers make when using LINQ is using single-letter parameter names in lambda functions. While this may seem harmless in simple queries, it can quickly become confusing and make code harder to understand in complex scenarios. To avoid this, it's crucial to use meaningful and descriptive parameter names that reflect the purpose of the query.
Example: Consider the following LINQ query that returns a list of "Make" strings for the remaining Diesel cars in the lot:
public IEnumerable<string> GetRemainingDieselsMakes()
{
    return this.cars.GroupBy(c => c.Make)
                    .Where(g => g.Any(c => c.Energy == Energy.Diesel))
                    .Select(g => g.Key);
}
To improve readability, it's recommended to use descriptive parameter names:
public IEnumerable<string> GetRemainingDieselsMakes()
{
    return this.cars.GroupBy(car => car.Make)
                    .Where(carMakeGroup => carMakeGroup.Any(car => car.Energy == Energy.Diesel))
                    .Select(dieselCarGroup => dieselCarGroup.Key);
}
By using more descriptive names, such as car instead of c and carMakeGroup instead of g, the code becomes more self-explanatory, making it easier to understand and maintain.
Complex Lambda Functions
Developers should consider using a method with a descriptive name when the logic inside the lambda expression becomes complex or when the same logic needs to be reused across multiple queries. By encapsulating the logic in a separate method, it becomes self-explanatory, promoting code readability and reducing the chances of introducing errors.
Example:
public IEnumerable<Car> GetCarSuggestionForYoungEnthusiasts()
{
    // Cast to double so the division is not truncated to an integer.
    return this.cars.Where(car => (double)car.HorsePower / car.Weight * 10 > 12 &&
                                  car.Price < 15_000);
}
In the example above, the method name hints at what the output represents, but the actual definition of the right car for a young enthusiast is not clear. Here is how named methods can make it clearer than an inline lambda:
public IEnumerable<Car> GetCarSuggestionForYoungEnthusiasts()
{
    return this.cars.Where(car => HasHighPowerToWeightRatio(car) &&
                                  IsAffordable(car));
}

public bool HasHighPowerToWeightRatio(Car car)
{
    // TODO: magic number is used, extract it to a constant
    // Cast to double so the division is not truncated to an integer.
    return (double)car.HorsePower / car.Weight * 10 > 12;
}

public bool IsAffordable(Car car)
{
    // TODO: magic number is used, extract it to a constant
    return car.Price < 15_000;
}
In the variation above, it is clear that a car suggestion for a young enthusiast has a good power-to-weight ratio and is affordable. We can also use extension methods to "attach" these checks to the Car class, which makes the query read even better:
public IEnumerable<Car> GetCarSuggestionForYoungEnthusiasts()
{
    return this.cars.Where(car => car.HasHighPowerToWeightRatio() &&
                                  car.IsAffordable());
}

public static class CarExtensions
{
    public static bool HasHighPowerToWeightRatio(this Car car)
    {
        // TODO: magic number is used, extract it to a constant
        // Cast to double so the division is not truncated to an integer.
        return (double)car.HorsePower / car.Weight * 10 > 12;
    }

    public static bool IsAffordable(this Car car)
    {
        // TODO: magic number is used, extract it to a constant
        return car.Price < 15_000;
    }
}
The other advantage of using methods and extension methods in LINQ is that they can be reused. Let's implement a method that gives "Winter Beater" suggestions (a winter beater is a cheap car kept for snowy regions where the roads are salted and corrosion is a big issue):
public IEnumerable<Car> GetWinterBeaterCarSuggestion()
{
    return this.cars.Where(car => car.Traction is Traction.Awd &&
                                  car.IsAffordable());
}
Here we reuse the IsAffordable extension method and define a Winter Beater as an affordable all-wheel-drive car.
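The TODO comments in the helper methods above can be addressed by lifting the thresholds into named constants. The sketch below is one possible shape rather than a definitive implementation: the constant names MinPowerToWeightScore and MaxAffordablePrice are hypothetical, the Car class is trimmed to the three properties used here, and the cast to double is an assumption made so that the division keeps its fractional part.

```csharp
// Trimmed stand-in for the article's Car class (only the properties used here).
public class Car
{
    public int Price { get; set; }
    public int HorsePower { get; set; }
    public int Weight { get; set; }
}

public static class CarExtensions
{
    // Hypothetical constant names; the values are the thresholds used in the article.
    private const double MinPowerToWeightScore = 12;
    private const int MaxAffordablePrice = 15_000;

    public static bool HasHighPowerToWeightRatio(this Car car)
    {
        // Cast first: dividing two ints would discard the fractional part.
        return (double)car.HorsePower / car.Weight * 10 > MinPowerToWeightScore;
    }

    public static bool IsAffordable(this Car car)
    {
        return car.Price < MaxAffordablePrice;
    }
}
```

With the thresholds named, a change to what "affordable" means happens in one place, and every query that calls IsAffordable() picks it up.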
Re-evaluating Queries due to Lazy Query Evaluation
C# LINQ employs lazy (deferred) query evaluation, which means the query is not executed until its result is enumerated. While this can improve performance by avoiding unnecessary calculations, it also means the query runs again every time it is enumerated, which can lead to unintentional re-evaluation if not handled correctly.
To avoid re-evaluating queries, developers should materialize the query result when it will be used more than once. This can be achieved with methods like ToList() or ToArray(), which force the query execution and cache the results. (FirstOrDefault() also executes the query immediately, but it returns a single element rather than a cached sequence.)
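Example: Consider the following code snippet, which evaluates the query more than once because Count() and Average() each enumerate it. (The cost shows up when this.cars is a deferred query, such as the result of a Where() call or a database query; for an in-memory list like the one declared earlier, both calls simply read the list.)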
public double GetHorspowerAverage()
{
    if (this.cars.Count() < 1_000)
    {
        return this.cars.Average(car => car.HorsePower);
    }
    else
    {
        return DoASpecialCalculation(this.cars);
    }
}
To avoid re-evaluation, materialize the query results into a list before performing further operations:
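public double GetHorspowerAverage()
{
    // Enumerate the source once and keep the results.
    IList<Car> carsList = this.cars.ToList();
    if (carsList.Count < 1_000)
    {
        return carsList.Average(car => car.HorsePower);
    }
    else
    {
        // Pass the cached list, not the original source, to avoid another evaluation.
        return DoASpecialCalculation(carsList);
    }
}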
By using ToList() at the beginning, the query is evaluated only once, and subsequent operations are performed on the cached results.
Neglecting LINQ Set Operations
LINQ provides powerful set operations like Union, Intersect, and Except, which are often underutilized by developers. These operations can simplify complex queries and reduce code duplication, leading to cleaner and more efficient code.
To leverage them effectively, developers should familiarize themselves with the available methods and use them when appropriate. The commonly used set operations are Distinct(), Union(), Intersect(), and Except(). SelectMany() is often mentioned alongside them; strictly speaking it is a projection operator that flattens nested sequences rather than a set operation, but it is just as useful for replacing nested loops.
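As an illustration, here is a minimal sketch of these operators on two collections of car makes. The northLotMakes and southLotMakes arrays are hypothetical sample data, not part of the CarDealership class above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample data: the makes in stock at two lots.
string[] northLotMakes = { "Audi", "BMW", "Volvo" };
string[] southLotMakes = { "BMW", "Toyota", "Volvo" };

// Union: every make available at either lot, without duplicates.
List<string> allMakes = northLotMakes.Union(southLotMakes).ToList();

// Intersect: makes available at both lots.
List<string> sharedMakes = northLotMakes.Intersect(southLotMakes).ToList();

// Except: makes found only at the north lot.
List<string> northOnlyMakes = northLotMakes.Except(southLotMakes).ToList();

// SelectMany flattens nested sequences; unlike Union, it keeps duplicates.
string[][] lots = { northLotMakes, southLotMakes };
List<string> flattenedMakes = lots.SelectMany(lotMakes => lotMakes).ToList();

Console.WriteLine(string.Join(", ", allMakes));        // Audi, BMW, Volvo, Toyota
Console.WriteLine(string.Join(", ", sharedMakes));     // BMW, Volvo
Console.WriteLine(string.Join(", ", northOnlyMakes));  // Audi
Console.WriteLine(flattenedMakes.Count);               // 6
```

These calls compare elements with the default equality comparer, so the strings are matched case-sensitively. For custom types such as Car, each of these methods has an overload that accepts an IEqualityComparer<T>.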
Conclusion
By being mindful of parameter naming in lambda functions, using lazy query evaluation appropriately, and making use of LINQ set operations, developers can avoid common mistakes when working with C# LINQ. These best practices lead to code that is more readable, maintainable, and performant.