2 changes: 1 addition & 1 deletion .github/ISSUE_TEMPLATE/bug_report.md
@@ -22,7 +22,7 @@ If applicable, add screenshots to help explain your problem.
**Desktop (please complete the following information):**
- OS: [e.g. Ubuntu]
- Python version [e.g. 3.8]
-- DeepTabular Version [e.g. 1.6.0]
+- deeptab Version [e.g. 1.6.0]

**Additional context**
Add any other context about the problem here.
68 changes: 34 additions & 34 deletions README.md
@@ -2,24 +2,24 @@
<img src="./docs/images/logo/mamba_tabular.jpg" width="400"/>


-[![PyPI](https://img.shields.io/pypi/v/deeptabular)](https://pypi.org/project/deeptabular)
-![PyPI - Downloads](https://img.shields.io/pypi/dm/deeptabular)
-[![docs build](https://readthedocs.org/projects/deeptabular/badge/?version=latest)](https://deeptabular.readthedocs.io/en/latest/?badge=latest)
-[![docs](https://img.shields.io/badge/docs-latest-blue)](https://deeptabular.readthedocs.io/en/latest/)
-[![open issues](https://img.shields.io/badge/contributions-welcome-brightgreen.svg?style=flat)](https://github.com/OpenTabular/DeepTabular/issues)
+[![PyPI](https://img.shields.io/pypi/v/deeptab)](https://pypi.org/project/deeptab)
+![PyPI - Downloads](https://img.shields.io/pypi/dm/deeptab)
+[![docs build](https://readthedocs.org/projects/deeptab/badge/?version=latest)](https://deeptab.readthedocs.io/en/latest/?badge=latest)
+[![docs](https://img.shields.io/badge/docs-latest-blue)](https://deeptab.readthedocs.io/en/latest/)
+[![open issues](https://img.shields.io/badge/contributions-welcome-brightgreen.svg?style=flat)](https://github.com/OpenTabular/deeptab/issues)


-[📘Documentation](https://deeptabular.readthedocs.io/en/latest/index.html) |
-[🛠️Installation](https://deeptabular.readthedocs.io/en/latest/installation.html) |
-[Models](https://deeptabular.readthedocs.io/en/latest/api/models/index.html) |
-[🤔Report Issues](https://github.com/OpenTabular/DeepTabular/issues)
+[📘Documentation](https://deeptab.readthedocs.io/en/latest/index.html) |
+[🛠️Installation](https://deeptab.readthedocs.io/en/latest/installation.html) |
+[Models](https://deeptab.readthedocs.io/en/latest/api/models/index.html) |
+[🤔Report Issues](https://github.com/OpenTabular/deeptab/issues)
</div>

<div style="text-align: center;">
-<h1>DeepTabular: Tabular Deep Learning Made Simple</h1>
+<h1>deeptab: Tabular Deep Learning Made Simple</h1>
</div>

-DeepTabular is a Python library for tabular deep learning. It includes models that leverage the Mamba (State Space Model) architecture, as well as other popular models like TabTransformer, FTTransformer, TabM and tabular ResNets. Check out our paper `Mambular: A Sequential Model for Tabular Deep Learning`, available [here](https://arxiv.org/abs/2408.06291). Also check out our paper introducing [TabulaRNN](https://arxiv.org/pdf/2411.17207) and analyzing the efficiency of NLP inspired tabular models.
+deeptab is a Python library for tabular deep learning. It includes models that leverage the Mamba (State Space Model) architecture, as well as other popular models like TabTransformer, FTTransformer, TabM, and tabular ResNets. Check out our paper `Mambular: A Sequential Model for Tabular Deep Learning`, available [here](https://arxiv.org/abs/2408.06291). Also check out our paper introducing [TabulaRNN](https://arxiv.org/pdf/2411.17207) and analyzing the efficiency of NLP-inspired tabular models.

<h3>⚡ What's New ⚡</h3>
<ul>
@@ -48,10 +48,10 @@ DeepTabular is a Python library for tabular deep learning. It includes models th


# 🏃 Quickstart
-Similar to any sklearn model, DeepTabular models can be fit as easy as this:
+Similar to any sklearn model, deeptab models can be fit as easily as this:

```python
-from deeptabular.models import MambularClassifier
+from deeptab.models import MambularClassifier
# Initialize and fit your model
model = MambularClassifier()

@@ -60,7 +60,7 @@ model.fit(X, y, max_epochs=150, lr=1e-04)
```

# 📖 Introduction
-DeepTabular is a Python package that brings the power of advanced deep learning architectures to tabular data, offering a suite of models for regression, classification, and distributional regression tasks. Designed with ease of use in mind, DeepTabular models adhere to scikit-learn's `BaseEstimator` interface, making them highly compatible with the familiar scikit-learn ecosystem. This means you can fit, predict, and evaluate using DeepTabular models just as you would with any traditional scikit-learn model, but with the added performance and flexibility of deep learning.
+deeptab is a Python package that brings the power of advanced deep learning architectures to tabular data, offering a suite of models for regression, classification, and distributional regression tasks. Designed with ease of use in mind, deeptab models adhere to scikit-learn's `BaseEstimator` interface, making them highly compatible with the familiar scikit-learn ecosystem. This means you can fit, predict, and evaluate using deeptab models just as you would with any traditional scikit-learn model, but with the added performance and flexibility of deep learning.
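Concretely, the familiar sklearn workflow carries over unchanged. Here is a minimal sketch, assuming `X` and `y` hold a classification dataset; `train_test_split` and `accuracy_score` are plain sklearn, and the `fit` arguments are the ones from the quickstart above:

```python
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

from deeptab.models import MambularClassifier

# Split the data as you would for any sklearn estimator
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

model = MambularClassifier()
model.fit(X_train, y_train, max_epochs=150, lr=1e-04)

# Evaluate with any sklearn metric
print(accuracy_score(y_test, model.predict(X_test)))
```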


# 🤖 Models
@@ -94,13 +94,13 @@ Hence, they are available as e.g. `MambularRegressor`, `MambularClassifier` or `

# 📚 Documentation

-You can find the DeepTabular API documentation [here](https://deeptabular.readthedocs.io/en/latest/).
+You can find the deeptab API documentation [here](https://deeptab.readthedocs.io/en/latest/).

# 🛠️ Installation

-Install DeepTabular using pip:
+Install deeptab using pip:
```sh
-pip install deeptabular
+pip install deeptab
```

If you want to use the original mamba and mamba2 implementations, additionally install mamba-ssm via:
@@ -120,7 +120,7 @@ pip install mamba-ssm

<h2> Preprocessing </h2>

-DeepTabular uses pretab preprocessing: https://github.com/OpenTabular/PreTab
+deeptab uses pretab preprocessing: https://github.com/OpenTabular/PreTab

Hence, data types etc. are detected automatically, and all preprocessing methods from pretab as well as from sklearn.preprocessing are available.
Additionally, you can specify that each feature is preprocessed differently, according to your requirements, by setting the `feature_preprocessing={}` argument during model initialization.
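For illustration, a minimal sketch; the column names below are hypothetical, and the method strings stand in for whichever pretab or sklearn.preprocessing methods you need:

```python
from deeptab.models import MambularClassifier

# Hypothetical column names; each value names a preprocessing method
# available through pretab or sklearn.preprocessing.
model = MambularClassifier(
    feature_preprocessing={
        "age": "ple",  # piecewise linear encoding for a numerical feature
        "income": "standardization",
        "city": "one-hot",
    }
)
```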
@@ -144,10 +144,10 @@ For an overview over all available methods: [pretab](https://github.com/OpenTabu


<h2> Fit a Model </h2>
-Fitting a model in deeptabular is as simple as it gets. All models in deeptabular are sklearn BaseEstimators. Thus the `.fit` method is implemented for all of them. Additionally, this allows for using all other sklearn inherent methods such as their built in hyperparameter optimization tools.
+Fitting a model in deeptab is as simple as it gets. All models in deeptab are sklearn BaseEstimators, so the `.fit` method is implemented for all of them. This also lets you use sklearn's other built-in tooling, such as its hyperparameter optimization utilities.

```python
-from deeptabular.models import MambularClassifier
+from deeptab.models import MambularClassifier
# Initialize and fit your model
model = MambularClassifier(
    d_model=64,
@@ -243,12 +243,12 @@ Or use the built-in bayesian hpo simply by running:
best_params = model.optimize_hparams(X, y)
```

-This automatically sets the search space based on the default config from ``deeptabular.configs``. See the documentation for all params with regard to ``optimize_hparams()``. However, the preprocessor arguments are fixed and cannot be optimized here.
+This automatically sets the search space based on the default config from ``deeptab.configs``. See the documentation for all parameters of ``optimize_hparams()``. Note that the preprocessor arguments are fixed and cannot be optimized here.
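Because every model is a sklearn `BaseEstimator`, sklearn's own search utilities should also work. A minimal sketch, assuming constructor hyperparameters such as `d_model` (used in the fitting example above) are exposed through sklearn's `get_params`/`set_params` protocol:

```python
from sklearn.model_selection import GridSearchCV

from deeptab.models import MambularClassifier

# Grid over a constructor hyperparameter; any other exposed
# hyperparameter can be added to the grid in the same way.
search = GridSearchCV(
    MambularClassifier(),
    param_grid={"d_model": [32, 64, 128]},
    cv=3,
)
search.fit(X, y)
print(search.best_params_)
```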


<h2> ⚖️ Distributional Regression with MambularLSS </h2>

-MambularLSS allows you to model the full distribution of a response variable, not just its mean. This is crucial when understanding variability, skewness, or kurtosis is important. All DeepTabular models are available as distributional models.
+MambularLSS allows you to model the full distribution of a response variable, not just its mean. This is crucial when understanding variability, skewness, or kurtosis is important. All deeptab models are available as distributional models.

<h3> Key Features of MambularLSS: </h3>

@@ -277,10 +277,10 @@ These distribution classes make MambularLSS versatile in modeling various data t

<h3> Getting Started with MambularLSS: </h3>

-To integrate distributional regression into your workflow with `MambularLSS`, start by initializing the model with your desired configuration, similar to other DeepTabular models:
+To integrate distributional regression into your workflow with `MambularLSS`, start by initializing the model with your desired configuration, similar to other deeptab models:

```python
-from deeptabular.models import MambularLSS
+from deeptab.models import MambularLSS

# Initialize the MambularLSS model
model = MambularLSS(
@@ -305,18 +305,18 @@ model.fit(

# 💻 Implement Your Own Model

-DeepTabular allows users to easily integrate their custom models into the existing logic. This process is designed to be straightforward, making it simple to create a PyTorch model and define its forward pass. Instead of inheriting from `nn.Module`, you inherit from DeepTabular's `BaseModel`. Each DeepTabular model takes three main arguments: the number of classes (e.g., 1 for regression or 2 for binary classification), `cat_feature_info`, and `num_feature_info` for categorical and numerical feature information, respectively. Additionally, you can provide a config argument, which can either be a custom configuration or one of the provided default configs.
+deeptab allows users to easily integrate their custom models into the existing logic. This process is designed to be straightforward, making it simple to create a PyTorch model and define its forward pass. Instead of inheriting from `nn.Module`, you inherit from deeptab's `BaseModel`. Each deeptab model takes three main arguments: the number of classes (e.g., 1 for regression or 2 for binary classification), `cat_feature_info`, and `num_feature_info` for categorical and numerical feature information, respectively. Additionally, you can provide a config argument, which can either be a custom configuration or one of the provided default configs.

-One of the key advantages of using DeepTabular is that the inputs to the forward passes are lists of tensors. While this might be unconventional, it is highly beneficial for models that treat different data types differently. For example, the TabTransformer model leverages this feature to handle categorical and numerical data separately, applying different transformations and processing steps to each type of data.
+One of the key advantages of using deeptab is that the inputs to the forward passes are lists of tensors. While this might be unconventional, it is highly beneficial for models that treat different data types differently. For example, the TabTransformer model leverages this feature to handle categorical and numerical data separately, applying different transformations and processing steps to each type of data.

-Here's how you can implement a custom model with DeepTabular:
+Here's how you can implement a custom model with deeptab:

1. **First, define your config:**
The configuration class allows you to specify hyperparameters and other settings for your model. This can be done using a simple dataclass.

```python
from dataclasses import dataclass
-from deeptabular.configs import BaseConfig
+from deeptab.configs import BaseConfig

@dataclass
class MyConfig(BaseConfig):
@@ -332,8 +332,8 @@
Define your custom model just as you would for an `nn.Module`. The main difference is that you will inherit from `BaseModel` and use the provided feature information to construct your layers. To integrate your model into the existing API, you only need to define the architecture and the forward pass.

```python
-from deeptabular.base_models.utils import BaseModel
-from deeptabular.utils.get_feature_dimensions import get_feature_dimensions
+from deeptab.base_models.utils import BaseModel
+from deeptab.utils.get_feature_dimensions import get_feature_dimensions
import torch
import torch.nn as nn

@@ -372,19 +372,19 @@ Here's how you can implement a custom model with DeepTabular:
        return output
```

-3. **Leverage the DeepTabular API:**
-You can build a regression, classification, or distributional regression model that can leverage all of DeepTabular's built-in methods by using the following:
+3. **Leverage the deeptab API:**
+You can build a regression, classification, or distributional regression model that can leverage all of deeptab's built-in methods by using the following:

```python
-from deeptabular.models.utils import SklearnBaseRegressor
+from deeptab.models.utils import SklearnBaseRegressor

class MyRegressor(SklearnBaseRegressor):
    def __init__(self, **kwargs):
        super().__init__(model=MyCustomModel, config=MyConfig, **kwargs)
```

4. **Train and evaluate your model:**
-You can now fit, evaluate, and predict with your custom model just like with any other DeepTabular model. For classification or distributional regression, inherit from `SklearnBaseClassifier` or `SklearnBaseLSS` respectively.
+You can now fit, evaluate, and predict with your custom model just like with any other deeptab model. For classification or distributional regression, inherit from `SklearnBaseClassifier` or `SklearnBaseLSS`, respectively.

```python
regressor = MyRegressor(numerical_preprocessing="ple")
```
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
@@ -1,7 +1,7 @@
import torch
import torch.nn as nn
import torch.nn.functional as F
-from deeptabular.arch_utils.layer_utils.sparsemax import sparsemax, sparsemoid
+from deeptab.arch_utils.layer_utils.sparsemax import sparsemax, sparsemoid
from .data_aware_initialization import ModuleWithInit
from .numpy_utils import check_numpy
import numpy as np
@@ -1,49 +1,49 @@
from .layer_utils.normalization_layers import (
BatchNorm,
GroupNorm,
InstanceNorm,
LayerNorm,
LearnableLayerScaling,
RMSNorm,
)


def get_normalization_layer(config):
    """Function to return the appropriate normalization layer based on the configuration.

    Parameters:
    -----------
    config : DefaultMambularConfig
        Configuration object containing the parameters for the model including normalization.

    Returns:
    --------
    nn.Module:
        The normalization layer as per the config.

    Raises:
    -------
    ValueError:
        If an unsupported normalization layer is specified in the config.
    """

    norm_layer = getattr(config, "norm", None)
    d_model = getattr(config, "d_model", 128)
    layer_norm_eps = getattr(config, "layer_norm_eps", 1e-05)

    if norm_layer == "RMSNorm":
        return RMSNorm(d_model, eps=layer_norm_eps)
    elif norm_layer == "LayerNorm":
        return LayerNorm(d_model, eps=layer_norm_eps)
    elif norm_layer == "BatchNorm":
        return BatchNorm(d_model, eps=layer_norm_eps)
    elif norm_layer == "InstanceNorm":
        return InstanceNorm(d_model, eps=layer_norm_eps)
    elif norm_layer == "GroupNorm":
        return GroupNorm(1, d_model, eps=layer_norm_eps)
    elif norm_layer == "LearnableLayerScaling":
        return LearnableLayerScaling(d_model)
    elif norm_layer is None:
        return None
    else:
        raise ValueError(f"Unsupported normalization layer: {norm_layer}")
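Because the function only reads attributes via `getattr`, any object exposing `norm`, `d_model`, and `layer_norm_eps` can serve as the config. A minimal usage sketch with a stand-in config object (`SimpleNamespace` here is purely illustrative):

```python
from types import SimpleNamespace

# Stand-in config; only the attributes read above need to exist,
# and missing ones fall back to the getattr defaults.
cfg = SimpleNamespace(norm="RMSNorm", d_model=64, layer_norm_eps=1e-05)
norm = get_normalization_layer(cfg)  # returns RMSNorm(64, eps=1e-05)
```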