ML.NET: can Microsoft's machine learning be trusted?

In 2018, Microsoft created МL.NЕТ, a machine learning framework for .NET developers. Since then, the machine learning library has undergone significant changes and acquired new features to identify patterns within data. Let's see how these changes have affected the quality of МL.NЕТ source code. And explore the errors, of course!

image1.png

Introduction

Before we start, let's have a closer look at the framework. Microsoft created МL.NЕТ as a free, open-source machine learning library for the C# and F# programming languages. МL.NЕТ allows you to build functionality for automated predictions using the dataset available to your application.

You can use МL.NЕТ to train a custom machine learning model by specifying an algorithm, or you can import the pre-trained TensorFlow and ONNX models.

You can add the model to your application to get predictions.

At the time of writing, МL.NЕТ included the following: classification/categorization, regression, anomaly detection, recommender systems implementation, image classification, time series and sequential data analysis.

Knowing what МL.NЕТ is, we can now move on to inspect the errors PVS-Studio found. But first, we should give credit to Microsoft developers, since the project's code is of high quality: it was quite a challenge to find any terrible or curious errors for this article. Yet, there are a few critical issues worthy of attention.

We checked the МL.NЕТ 1.7.1 version. The source code of this project's version is available on GitHub.

So, let's dive in and see what PVS-Studio found in МL.NЕТ!

МL.NЕТ error review

An object was used after a null check

Issue 1

private EstimatorChain(...., IEstimator<ITransformer>[] estimators, ....)
{
  ....
  _estimators = estimators ?? new IEstimator<ITransformer>[0]; // <=
  ....
  LastEstimator = estimators.LastOrDefault() as // <=
                  IEstimator<TLastTransformer>;

  _needCacheAfter = needCacheAfter ?? new bool[0];
  ....
}

PVS-Studio warning: V3125. The 'estimators' object was used after it was verified against null. EstimatorChain.cs 37, 39

Let's take a closer look at the warning. PVS-Studio detected a possible error that could lead to a null reference dereferencing. Let's examine the code. The null-coalescing operator checks whether the estimators input parameter carries a value. The operator's result is recorded to the _estimators field.

On the line below, developers use the estimators parameter once again, calling the LastOrDefault method. Accessing the estimators parameter that may have a null value may cause NullReferenceException.

The solution to this problem is to use the _estimators field — instead of the estimators variable — to call the LastOrDefault method:

LastEstimator = _estimators.LastOrDefault() as IEstimator<TLastTransformer>;

Names of the estimators parameter and the _estimators class field are quite similar. Developers may have just mixed them up. When writing code, they could easily miss such a small detail. But the static analyzer pays attention even to the smallest elements of code. Thus, you do not have to search for these errors and correct them afterwards.

Two 'if' statements with identical conditional expressions

Issue 2

private protected override void CheckOptions(IChannel ch)
{
  ....
  if (FastTreeTrainerOptions.NumberOfLeaves > 2 &&
      FastTreeTrainerOptions.HistogramPoolSize >
      FastTreeTrainerOptions.NumberOfLeaves - 1)
  {
    throw ch.Except("Histogram pool size (ps) must be at least 2."); 
  }

  if (FastTreeTrainerOptions.NumberOfLeaves > 2 &&
      FastTreeTrainerOptions.HistogramPoolSize >
      FastTreeTrainerOptions.NumberOfLeaves - 1)
  {
    throw ch.Except("Histogram pool size (ps) must be at most numLeaves - 1."); 
  }
  ....
}

PVS-Studio warning: V3021There are two 'if' statements with identical conditional expressions. The first 'if' statement contains method return. This means that the second 'if' statement is senseless. BoostingFastTree.cs 44, 47

The analyzer detected two similar if statements with identical conditional expressions. The program terminates since the if block throws an exception. It may just look like a useless pile of code, but let's inspect the problem from a different angle. Look closer at the exceptions' error messages – they are not identical. The developers probably intended to implement two different checks. They may have copied the check, changed the message, but forgot to correct the condition. Thus, one possible error remained unhandled and still poses a threat to the full and correct operation of the program.

The variable was used after it was assigned a value through null-conditional operator

Issue 3

private static int CheckKeyLabelColumnCore<T>(....) where T : ....
{
  ....
  for (int i = 1; i < models.Length; i++)
  {
    ....
    var mdType = labelCol.Annotations
                         .Schema
                         .GetColumnOrNull(....) // <=
     ?.Type;

    if (!mdType.Equals(keyValuesType))
      throw env.Except(....);
    ....
  }
  ....
}

PVS-Studio warning: V3105The 'mdType' variable was used after it was assigned through null-conditional operator. NullReferenceException is possible. PipelineEnsemble.cs 670

This time, the analyzer issues warnings on a possible occurrence of NullReferenceException when dereferencing a variable that may the null value. The analyzer thinks so because there is no null check between when the value is assigned and when it is used. Look at the code. The mdType variable is assigned with the value through a call chain. One of the parts of the chain is the GetColumnOrNull method, which returns null under a certain condition:

public Column? GetColumnOrNull(string name)
{
  if (string.IsNullOrEmpty(name))
    throw new ArgumentNullException(nameof(name));
  if (_nameMap.TryGetValue(name, out int col))
    return _columns[col];
  return null;
}

Immediately after the value is assigned, the mdType variable is accessed to call the Equals method in the condition statement. The code of this kind may result in an exception. The developers probably forgot to handle this case with the check.

Incorrect order of arguments passed to method

Issue 4

private IProgressChannel StartProgressChannel(int level)
{
  var newId = Interlocked.Increment(ref _maxSubId);
  return new SubChannel(this, level, newId);
}

PVS-Studio warning: V3066. Possible incorrect order of arguments passed to method. ProgressReporter.cs 148

The analyzer noticed that the order of arguments passed to the method is possibly incorrect. Indeed, the StartProgressChannel method returns an instance of the SubChannel class, whose constructor received the parameters in the incorrect order (id and level are mixed up). It's clear from the SubChannel class implementation:

public SubChannel(ProgressChannel root, int id, int level)
{
  ....
  id = id;
  level = level;
  ....
}

It seems like a simple typo! Even so, it may cause serious issues. Pay attention to the SubChannel constructor class implementation. Here, the _id and _level fields are assigned incorrect data. These fields may have a negative effect on how the class works.

In fact, such errors often occur in large-scale projects. As you see, МL.NЕТ is no exception. Here's another example:

Issue 5

internal sealed class LdaSingleBox : IDisposable
{
  ....
  public void AllocateModelMemory(int numTopic, int numVocab, ....)
  {
    ....
    LdaInterface.AllocateModelMemory(...., numVocab, numTopic, ....); // <=
  }
  ....
}

PVS-Studio warning: V3066. Possible incorrect order of arguments passed to 'AllocateModelMemory' method: 'numVocab' and 'numTopic'. LdaSingleBox.cs 142

Here, when the LdaInterface.AllocateModelMemory method is called, the input parameters (numTopic and numVocab) are mixed up. It's clear from the way the parameters were passed to the method's signature:

internal static class LdaInterface
{
  ....
  internal static extern void AllocateModelMemory(....,
                                                  int numTopic, // <=
                                                  int numVocab, // <=
                                                  ....);
  ....
}

An object was used after checking for null

Issue 6

IPredictor IModelCombiner.CombineModels(....)
{
  ....
  foreach (var model in models)
  {
    ....
    var calibrated = predictor as IWeaklyTypedCalibratedModelParameters;
    ....
    if (calibrated != null)
      _host.Check(....);

    predictor = calibrated.WeaklyTypedSubModel; // <=
    paramA = -((PlattCalibrator)calibrated.WeaklyTypedCalibrator).Slope; // <=
    ....
    foreach (var t in tree.TrainedEnsemble.Trees)
    {
      ....
      if (modelCount == 1)
      {
        binaryClassifier = calibrated != null;
        ....
      }
      else
      {
        _host.Check((calibrated != null) == binaryClassifier, ....);
        ....
      }
    }
    ....
  }
}

PVS-Studio warning: V3125. The 'calibrated' object was used after it was verified against null. TreeEnsembleCombiner.cs 54, 58, 59

Here's another way to see a NullReferenceException exception. Pay attention to the calibrated variable. This variable is initialized through the as operator, which returns null if the type conversion fails. This variable is then used unchecked:

....
predictor = calibrated.WeaklyTypedSubModel; // <=
paramA = -((PlattCalibrator)calibrated.WeaklyTypedCalibrator).Slope;
....

The calibrated variable is checked for null three times in different parts of the method. We can safely say that this problem is highly relevant for the МL.NЕТ project.

I would note that such errors are most encountered in this project, that's why I suggest considering a few more curious cases:

Issue 7

private void LoadSampledLabels(....)
{
  ....
  Contracts.Check(_instanceWeights == null || ....);
  if (....)
  {
    for (....)
    {
      ....
      weights[k] = (float)_instanceWeights[j];
    }
  }
  ....
}

PVS-Studio warning: V3125. The '_instanceWeights' object was used after it was verified against null. InternalQuantileRegressionTree.cs 81, 74

Here, the null check of the _instanceWeights field is implemented through the Contracts.Check method:

public static void Check(bool f, string msg)
{
  if (!f)
    throw Except(msg);
}

If _instanceWeights is null, the Contracts.Check method will handle the error by throwing NullReferenceException. It seems to be the developer's initial idea, but things turned out differently. Note how the Contracts.Check method is implemented. If _instanceWeights == null is true, the true value is passed to the Contracts.Check method as the f parameter and then the value is converted to false. As a result, the check for null would be incorrect and the code would keep executing. The next time the _instanceWeights is accessed, a NullReferenceException would be thrown.

Issue 8

private static List<....> Train(....)
{
  ....
  for (int i = 0; i < columns.Length; i++)
  {
    ....
    var srcColType = inputSchema[srcCol].Type as VectorDataViewType;
    if (srcColType == null || ....) // <=
      throw env.ExceptSchemaMismatch(...., srcColType.ToString()); // <=
    ....
  }
  ....
}

PVS-Studio warning: V3125. The 'srcColType' object was used after it was verified against null. Check lines: 843, 842. LdaTransform.cs 842, 843

This case struck me as curious since everything seemed to be fine here:

  • the type conversion is performed through the as operator, and then the result is written in the srcColType variable;
  • the srcColType variable is checked for null, and if the check succeeds, an exception is thrown with the error message.

And everything would be fine. However, the result of the srcColType.ToString() method, called for a potentially empty variable, is passed as the last parameter to the exception:

....
throw env.ExceptSchemaMismatch(...., srcColType.ToString());
....

Although this error does not seem extremely serious, it still has a pretty unpleasant consequence: some essential information for debugging (the error message) gets lost. In this case the information is particularly important, since the error is in a loop, and the number of iterations in the loop may reach several dozen. In similar cases, we need to figure out in what context the error occurred.

An expression is always false

Issue 9

public static IntArray New(...., IntArrayType type, IntArrayBits bitsPerItem)
{
  ....
  Contracts.CheckParam(type == IntArrayType.Current ||
                       type == IntArrayType.Repeat ||
                       type == IntArrayType.Segmented, nameof(type));

  if (type == IntArrayType.Dense || bitsPerItem == IntArrayBits.Bits0)
  {
    ....
  }
  else if (type == IntArrayType.Sparse)
    return new DeltaSparseIntArray(length, bitsPerItem);
  ....
  return null;
}

PVS-Studio warnings:

V3022. Expression 'type == IntArrayType.Sparse' is always false. IntArray.cs 145 V3063. A part of conditional expression is always false if it is evaluated: type == IntArrayType.Dense. IntArray.cs 129 In this case, two warnings were triggered by the same cause. That's another curious case, so let's get into it. Under given conditions, type may have IntArrayType.Sparse or IntArrayType.Dense value. But the analyzer finds it wrong. Is the analyzer right? Let's solve this case. Look at the following code:

Contracts.CheckParam(type == IntArrayType.Current || 
                     type == IntArrayType.Repeat || 
                     type == IntArrayType.Segmented, nameof(type));

Here, the Contracts.CheckParam method is used to check if the type input parameter has one of the following values: IntArrayType.Current, IntArrayType.Repeat, or IntArrayType.Segmented.

Now, let's pay attention to the Contracts.CheckParam method implementation:

public static void CheckParam(bool f, string paramName)
{
  if (!f)
    throw ExceptParam(paramName);
}

As you can see, if type is not equal to any of the above values, an exception is thrown.

As a result, we have two sets of values for the type input parameter:

  • expected values (IntArrayType.Sparse, IntArrayType.Dense);
  • valid values (IntArrayType.Current, IntArrayType.Repeat, IntArrayType.Segmented).

Evidently, the values are completely different, so a serious mistake may have been made here. The case is solved!

We could say that the developers did not plan to enumerate all the valid values, when implementing the above check. But they intended to filter the invalid values for this method. Therefore, the correct implementation of this check may be as follows:

Contracts.CheckParam(type != IntArrayType.Current || 
                     type != IntArrayType.Repeat || 
                     type != IntArrayType.Segmented, nameof(type));

Conclusion

The analyzer detected only a few serious bugs in the МL.NЕТ project ('human' errors in most cases). So, yes, the Microsoft team does know their job. But, anyway, even the professionals can make mistakes. So why not use a static analyzer in your project, thereby minimizing the number of errors and making your projects even better and more reliable?

You can use PVS-Studio for free by downloading the trial version here.

We recommend you to learn the basics of using PVS-Studio in the documentation beforehand.

After all, I would like to wish the company success in the МL.NЕТ machine learning framework developing. I'm highly interested in this project!

So, I hope you found this article curious :).

Good luck and clean code to you! Thank you and see you soon!

Did you find this article valuable?

Support Unicorn Developer by becoming a sponsor. Any amount is appreciated!