Feature-Kernel/experimental_results.tex at master · Optimization-Algorithms/Feature-Kernel · GitHub

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
\section{Experimental Results}

In general \fk{} behaves very poorly. All the fallowing problems, taken from MIP lib, have been solved
using the method described above. To be more concise we  present only the results from \fk{}, because
the rest of the \ks{} is the same as in a standard implementation.

In order to find some patterns and to identify the actual quality of the method all the following tests was
run with the same configuration:
\begin{table}[H]
    \centering
    \begin{tabular}{|l|l|}
        \hline
        Parameter & Value \\ \hline\hline
        Count & 50 \\ \hline
        Min Time & 10 \\ \hline
        Max Time & 40 \\ \hline
    \end{tabular}
    \caption{\fk{} configuration}\label{tab:ks-config}
\end{table}

In this configuration are not present all the required parameters: the remaining are omitted because they are not
involved into the way \fk{} works.

In the following examples we use a simple schema to show the name of the instance, its URL and a graph showing
the convergence of the method. In the abscissa there is the \fk{} iteration, on the ordinate there is the \emph{usage ratio} (see \Cref{eq:usage-ratio}).
A \fk{} iteration can end in four different statuses (see \Cref{sec:submodel-result}), identified on each graph using four different colors, as described:
\begin{table}[H]
    \centering
    \begin{tabular}{|l|l|}
        \hline
        Status & Color \\ \hline\hline
        Infeasible & Red \\ \hline
        Linear Feasible & Blue \\ \hline
        Integer Feasible & Green \\ \hline
        Timeout & Black \\ \hline
    \end{tabular}
    \caption{Color legend for feasibility convergence graphs}
\end{table}

\newcommand{\Figure}{}
\newcommand{\Table}{}
\newcommand{\Instance}{}

\newcommand{\importExample}[2]{
    \begin{table}[H]
        \centering
        \begin{tabularx}{\textwidth}{l X}
            Name: & #1 \\
            URL:  & \url{#2}
        \end{tabularx}
    \end{table}
    \begin{figure}[H]
        \graphicspath{{test-results/}}
        \centering
        \includegraphics[width=0.95\textwidth, keepaspectratio]{#1-ratio}
        \caption{Feasibility convergence for #1}\label{fig:conv-#1}
        \graphicspath{{images/}}
    \end{figure}
    \input{test-results/#1-ratio}\unskip
    \renewcommand{\Figure}{\Cref{fig:conv-#1}}
    \renewcommand{\Table}{\Cref{tab:stats-#1}}
    \renewcommand{\Instance}{#1}
}
\importExample{air03}{https://miplib.zib.de/instance_details_air03.html}

In this first example one can see that \fk{} starts finding and integer feasible solution
only when the $ur > 0.851$. Also is not \fk{} has not been able to find a linear feasible model
until $ur < 0.768$. From the \Figure{} one can also notice that the major part of the generated
models are infeasible, so they are completely useless to identify information about the importance
of each variable. From \Table{} it is possible to see that $ur$ does not define a precise border between
Infeasible and Linear Feasible region or between Linear and Integer Feasible region.


\importExample{beavma}{https://miplib.zib.de/instance_details_beavma.html}

In \Instance{} the situation is a bit different: \fk{} has not been able to find a Linear Feasible model.
Besides that also in this case reading carefully \Table{} allows to understand that $ur$ does not define any
boarder. From \Figure{} one can see that \fk{} generated on 3 valid sub-models for \Instance{}, and all with a $ur > 0.969$.
This means that the value of the score computed by \fk{} is almost insignificant. An insignificant dataset will produce a
meaningless random forest that will bring to a meaningless kernel.


\importExample{mas74}{https://miplib.zib.de/instance_details_mas74.html}

\Instance{} is a lucky case: it is the only where \fk{} has been able to find an Integer Feasible model with a small $ur$.
This is the desired behavior of the method: continuous jumps between feasible and infeasible models in order to
find with variables are more \emph{important}. It is worth notice, from \Table{}, that the submodel never reached a maximum
$ur = 0.615$ so a large part of all possible results was not controlled by \fk{}. It is not possible know if exploring
the rest of the \emph{ur space} could allow to find a better initial kernel.

\importExample{neos-2294525-abba}{https://miplib.zib.de/instance_details_neos-2294525-abba.html}

\Instance{}, in contrast of the other instances described above, is considered \emph{hard}. This fact becomes evident when one
takes a close look to \Figure{}: there are a lot of black dots showing that a good number of instances was stopped due to  timeout and
not because they were infeasible. With \Instance{} \fk{} was able to find a clear boarder between infeasible and linear feasible
instances: this is probably consequence of the structure of the constraints. Although it could be possible to increase the
maximum time allowed to solve an instance, it could be counterproductive infect the \fk's goal is to reduce the time spent
building an initial kernel.


\newcommand{\SM}{\aleph^{m}}
\subsection{Why \fk{} performes so poorly}
For semplicty let us identify with $I$ the original instance, with  $\SM$ the set of
all the subinstances with $m$ variables enabled and with $\SM_{i}$ the $i$-th instance
with $m$ variables enabled. We also define $\phi^{m} \subseteq \SM $ as the set of feasible instances
in $\SM$.
It is implied that $m \leq n$.

At each iteration \fk{} fixes $\SM$ (where $m = \lfloor ur * n \rfloor$) and randomly picks an
instance $\SM_{i}$. Of course when $m$ is small (i.e. $ur \approx 0.2$) is very unlikely that any
model in $\SM$ is feasible.
It also easy to immagine that when $m$ is large (i.e. $ur \approx 0.9$) is very unlikely
that there is not any feasible instances in $\SM$. It is correct to say that all linear model have a minimal $ur$ below that
the number of enabled variable is not enought to satisfy the constraints of the problem. Let be
\begin{align*}
    \tilde{m} = \min m \mid \SM{}\ \text{contains a feasible instance} \label{eq:min-m}
\end{align*}
This means that any information obtained by \fk{} when $ur < \frac{m}{n}$ is completely meaningless: it is obvious
that the problem will be infeasible and all the selected variables will be penalized by the method, but it is incorrect
to infer this information because the model cannot be feasible.
Unfortunatly there are not easy ways to identify the actual $\tilde{m}$ so we are stuck into the guessing performed by
\fk{}.


Given a set of $n$ elements the number of subsets with $m$ elements is the binomial coefficent
of $n$ over $m$, so the cardinality of $\SM$ is
\begin{align}
    |\SM| = \binom{n}{m}
\end{align}

\begin{figure}[ht]
    \centering
    \includegraphics[keepaspectratio, width=\textwidth]{binomial-20}
    \caption{$|\SM|$ when $n = 20$}\label{fig:plot-binom-20}
\end{figure}

As one can see from \Cref{fig:plot-binom-20} $|\SM|$ is a large number even when $n$ is relatively small.
In general the number of feasible instance in $\SM$ (with $m > \tilde{m}$) is small so the probably of
randomly peaking a good $\SM_{i}$ is
\begin{align}
    \frac{|\phi^{m}|}{|\SM|}
\end{align}

In the applications this is obvious that $|\phi^{m}| \ll |\SM|$, in general $|\phi^{m}|$ is so small with respect to $|\SM|$
that it is safe to assume
\begin{align}
    \frac{|\phi^{m}|}{|\SM|} \approx \binom{n}{m}^{-1} \label{eq:prob-good-instance}
\end{align}

\Cref{eq:min-m} show us why \fk{} has so many issues in the initial phase while
\Cref{eq:prob-good-instance} shows why for some instances was so difficult to find feasbile sub instances until $ur$ was close to $1$.
In the beginnig simply there are no feasible instances to find, but \fk{} looks for them anyway. In the middle phase \fk{} must be
extremely lucky to find a feasible instance. Only in the final phase, with $m$ close to $n$ and $|\SM|$ reasonably small, \fk{} was able
to find a good instance by peaking a random one. \fk{} requires a good amount of luck in order to perform properly and in practice
this is not possible.

There are instances, like mas74 which is a knapsack problem, where $|\phi^{m}|$ is large and in this cases \fk{} could perform well. This are
cases where due to the form of the constraints it is hard to generate an infeasible submodel. Other instances, like neos-2294525-abba  which is
a set patitioning problem, have a very small $|\phi^{m}|$ and so \fk{} cannot perform well. This are instances where also with the full model
it is hard to find a solution.