Python 3.12 xgboost.core.XGBoostError: Invalid Parameter format for nthread expect int but value='-1' when DMatrix used with import googlecloudprofiler. #144

@hwlodarczyk-rtbh

This issue was originally posted in the xgboost repo as dmlc/xgboost#10224.

Hi

I'm hitting a very peculiar error that started after I updated the versions of Python and the libraries in a project I'm working on.

A minimal example that reproduces it:

# file.py
import googlecloudprofiler
from xgboost import DMatrix

DMatrix([[]])
print("works")

# requirements.txt
xgboost==2.0.3
google-cloud-profiler==4.1.0
#
numpy==1.26.4
scipy==1.13.0
google-api-python-client==2.125.0
google-auth==2.29.0
google-auth-httplib2==0.2.0
protobuf==4.25.3
requests==2.31.0
#
cachetools==5.3.3
certifi==2024.2.2
charset-normalizer==3.3.2
google-api-core==2.18.0
httplib2==0.22.0
idna==3.6
pyasn1==0.6.0
pyasn1_modules==0.4.0
pyparsing==3.1.2
rsa==4.9
uritemplate==4.1.1
urllib3==2.2.1

Python 3.12.2

Install with

pip install -r requirements.txt --no-deps

Run with

python file.py

Results in

Traceback (most recent call last):
  File "/project/path/file.py", line 4, in <module>
    DMatrix([[]])
  File "/venv/path/lib/python3.12/site-packages/xgboost/core.py", line 730, in inner_f
    return func(**kwargs)
           ^^^^^^^^^^^^^^
  File "/venv/path/lib/python3.12/site-packages/xgboost/core.py", line 857, in __init__
    handle, feature_names, feature_types = dispatch_data_backend(
                                           ^^^^^^^^^^^^^^^^^^^^^^
  File "/venv/path/lib/python3.12/site-packages/xgboost/data.py", line 1081, in dispatch_data_backend
    return _from_list(data, missing, threads, feature_names, feature_types)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/venv/path/lib/python3.12/site-packages/xgboost/data.py", line 1011, in _from_list
    return _from_numpy_array(array, missing, n_threads, feature_names, feature_types)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/venv/path/lib/python3.12/site-packages/xgboost/data.py", line 207, in _from_numpy_array
    _check_call(
  File "/venv/path/lib/python3.12/site-packages/xgboost/core.py", line 282, in _check_call
    raise XGBoostError(py_str(_LIB.XGBGetLastError()))
xgboost.core.XGBoostError: Invalid Parameter format for nthread expect int but value='-1'

Removing import googlecloudprofiler from file.py makes the error go away. I really have no idea why merely importing the library causes this problem; it would make more sense if it only appeared after googlecloudprofiler.start is called.
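To confirm that the import alone is the trigger, both variants can be run in fresh interpreters so that nothing loaded by one run leaks into the other. This is only a minimal sketch: the helper name run_snippet and the two snippet constants are made up for illustration, and it assumes the packages from requirements.txt are installed in the interpreter that runs it.

```python
import subprocess
import sys

# Same body as file.py, once with and once without the profiler import.
WITH_PROFILER = """
import googlecloudprofiler
from xgboost import DMatrix
DMatrix([[]])
"""

WITHOUT_PROFILER = """
from xgboost import DMatrix
DMatrix([[]])
"""


def run_snippet(code: str) -> tuple[int, str]:
    """Run code in a fresh interpreter; return (exit code, last stderr line)."""
    proc = subprocess.run(
        [sys.executable, "-c", code], capture_output=True, text=True
    )
    lines = proc.stderr.strip().splitlines()
    return proc.returncode, (lines[-1] if lines else "")


if __name__ == "__main__":
    for label, code in (
        ("with import", WITH_PROFILER),
        ("without import", WITHOUT_PROFILER),
    ):
        returncode, last_error = run_snippet(code)
        print(f"{label}: exit={returncode} {last_error}")
```

If the report above holds, the first variant should exit non-zero with the XGBoostError as its last stderr line, and the second should exit cleanly.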

Moreover, the code works with xgboost==1.7.6 and has failed since xgboost==2.0.0.

A maintainer of xgboost mentioned:

loading the _profiler.cpython-312-x86_64-linux-gnu.so inside google profiler extension causes the error

dmlc/xgboost#10224 (comment)

This is why I've opened the issue here.
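To see which shared objects the import actually pulls into the process, the set of mapped .so files can be compared before and after the import. This is a rough sketch that assumes Linux, since it reads /proc/self/maps, and returns an empty set elsewhere. The function names are hypothetical, and it only lists what gets loaded; it does not explain why the nthread parsing breaks.

```python
import importlib
import os


def loaded_shared_objects() -> set[str]:
    """Return paths of .so files mapped into this process (Linux only)."""
    maps_path = "/proc/self/maps"
    if not os.path.exists(maps_path):
        return set()  # not Linux: nothing to inspect
    paths = set()
    with open(maps_path) as maps:
        for line in maps:
            # Fields: address perms offset dev inode [pathname]
            fields = line.split(maxsplit=5)
            if len(fields) == 6 and ".so" in fields[5]:
                paths.add(fields[5].strip())
    return paths


def shared_objects_added_by(module_name: str) -> set[str]:
    """Import module_name and return the .so paths that appeared afterwards."""
    before = loaded_shared_objects()
    importlib.import_module(module_name)
    return loaded_shared_objects() - before


if __name__ == "__main__":
    try:
        added = shared_objects_added_by("googlecloudprofiler")
    except ImportError:
        added = set()
        print("googlecloudprofiler is not installed")
    for path in sorted(added):
        print(path)
```

With the setup from this report, the list would be expected to include the _profiler.cpython-312-x86_64-linux-gnu.so file named in the maintainer's comment.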
