
Fix hyperscan build when using non-hermetic LLVM toolchain #43479

Merged
phlax merged 2 commits into envoyproxy:main from krinkinmu:test-contrib-build
Feb 13, 2026


@krinkinmu krinkinmu commented Feb 13, 2026

Commit Message:

When using a non-hermetic LLVM toolchain, the paths to the nm and objcopy tools passed via environment variables to the hyperscan build script are incorrect.

With a non-hermetic toolchain, the C++ toolchain make variables NM and OBJCOPY hold absolute paths to the host tools, so prefixing them turns correct paths into incorrect ones.

So one change in this PR is to check whether we are using a non-hermetic toolchain and, if so, pass in $(NM) and $(OBJCOPY) without modification.
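To illustrate why the prefix breaks an absolute path, here is a minimal shell sketch. The `resolve_tool` helper, the `/sandbox/execroot` prefix, and the tool paths are hypothetical and used only for illustration. The PR itself decides based on which kind of toolchain is in use, not by inspecting the path.

```shell
# Illustration only: resolve_tool, /sandbox/execroot and the tool paths are
# hypothetical names, not taken from the Envoy build files.
resolve_tool() {
    tool_path=$1
    root=$2
    case "$tool_path" in
        /*) printf '%s\n' "$tool_path" ;;            # absolute host path: use as is
        *)  printf '%s/%s\n' "$root" "$tool_path" ;; # relative path: anchor it to root
    esac
}

host_nm=/usr/bin/llvm-nm

# Blind prefixing turns a valid host path into one that does not exist.
blindly_prefixed="/sandbox/execroot/$host_nm"

# Leaving absolute paths alone keeps the host tool reachable.
fixed=$(resolve_tool "$host_nm" /sandbox/execroot)

# A relative (hermetic-style) path still gets the prefix.
hermetic=$(resolve_tool external/llvm/bin/llvm-nm /sandbox/execroot)

echo "prefixed: $blindly_prefixed"
echo "fixed:    $fixed"
echo "hermetic: $hermetic"
```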

The other part of this PR patches hyperscan slightly to make build_wrapper.sh fail when one of the commands in the script fails.

The reason for this change is that currently, if the script cannot find the nm and objcopy tools (for example, when you don't have them installed, which can happen with a non-hermetic compiler, or, before this PR, because we added an unneeded prefix to the host tool paths), it does not fail, yet it does not produce a correct result either.

What you get in the end is a failure during envoy-contrib linking, because the linker cannot find definitions for a number of symbols.

Let me try to explain what is going on there...

Hyperscan, specifically when we use the fat runtime option, does something quite clever and quite terrible: it builds the same library source code several times with different optimizations enabled and links all of those builds together into the final hyperscan library.

It builds one version of the library assuming that the AVX512 instruction set is available, another assuming that the AVX512VBMI instruction set is available, and so on. Thus we end up with multiple versions of the same library built with different optimizations.

Then it takes each of these versions and, using nm and objcopy, adds a prefix to the names of the exported symbols to avoid name collisions when they are later linked together (remember, all of them are built from the same source).

Finally, it links all those versions into one library and provides a dispatcher function. At runtime, the dispatcher detects which instruction sets are actually available on the machine and calls the appropriate implementation for that instruction set.
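The dispatch idea can be sketched in a few lines of shell. This is a simplified illustration, not hyperscan's detection logic: the real dispatcher is compiled into the library, and the `pick_variant` function, the variant names, and the flag names checked here are assumptions made for the example.

```shell
# Simplified sketch: pick_variant, the variant names and the flag checks are
# illustrative assumptions, not hyperscan's real detection code.
pick_variant() {
    flags=" $1 "   # pad with spaces so each flag can be matched as a whole word
    case "$flags" in
        *" avx512vbmi "*) echo avx512vbmi ;;
        *" avx512f "*)    echo avx512 ;;
        *" avx2 "*)       echo avx2 ;;
        *)                echo baseline ;;
    esac
}

# The most capable variant supported by the CPU wins.
chosen=$(pick_variant "fpu sse2 avx2 avx512f")
echo "dispatching to the $chosen build"
```

Because every variant exports the same function names, the dispatcher can only refer to them if each copy carries a distinct prefix, which is why the renaming step matters.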

The build_wrapper.sh script (https://github.com/intel/hyperscan/blob/master/cmake/build_wrapper.sh) is what takes the compiled object files and renames the symbols in them to avoid name collisions.
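The renaming step can be sketched roughly as follows. This is not the contents of build_wrapper.sh: the sample symbol listing, the `avx512` prefix, and the variable names are assumptions for illustration. `nm -g -P` and `objcopy --redefine-syms=FILE` are real binutils options, but the sketch only builds the rename map from canned text so it runs without a compiler.

```shell
# Canned listing in the style of `nm -g -P` (name, type, value, size).
# The symbols and the prefix below are illustrative assumptions.
nm_output='hs_scan T 0000000000000000 0000000000000010
hs_compile T 0000000000000010 0000000000000020
malloc U'

prefix=avx512

# Build "old new" pairs for every defined symbol; undefined ones (type U)
# belong to other libraries and must keep their names.
rename_map=$(printf '%s\n' "$nm_output" |
    awk -v p="$prefix" '$2 != "U" { print $1, p "_" $1 }')

printf '%s\n' "$rename_map"

# With real tools, the map would be written to a file and applied with:
#   objcopy --redefine-syms=rename.map object.o
```

If nm is missing, the map comes out empty, and a script that only runs objcopy when the map is non-empty will skip the renaming step without reporting anything.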

build_wrapper.sh is written in such a way that it does not fail when nm or objcopy is unavailable. You can run a simple experiment yourself with the following script:

blahblah > /tmp/test.file
if test -s /tmp/test.file
then
        blahblah-again
fi

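Even though neither the blahblah nor the blahblah-again command exists (and you will see a "command not found" error in the output), the script's overall exit status is 0 (e.g., echo $? prints 0). The redirection still creates an empty /tmp/test.file, so test -s fails, the if body is skipped, and the if statement itself completes with status 0.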
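One common way to make such a script fail fast is `set -e`. Whether the patch in this PR uses exactly that mechanism is an assumption. The sketch below runs the experiment twice in child shells, once as written and once with `set -e`, and compares the exit statuses. A temporary file from `mktemp` stands in for /tmp/test.file.

```shell
tmp=$(mktemp)

# The experiment from the text, run in a child shell so its status can be captured.
plain_status=0
sh -c '
blahblah > "$1"
if test -s "$1"
then
        blahblah-again
fi
' sh "$tmp" 2>/dev/null || plain_status=$?

# Same script with `set -e`: the shell exits at the first failing command.
strict_status=0
sh -c '
set -e
blahblah > "$1"
if test -s "$1"
then
        blahblah-again
fi
' sh "$tmp" 2>/dev/null || strict_status=$?

rm -f "$tmp"

echo "without set -e: $plain_status"
echo "with set -e:    $strict_status"
```

Note that `set -e` on its own does not catch a failure in an earlier stage of a pipeline, because a pipeline's status is that of its last command unless the shell's `pipefail` option is enabled.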

A similar thing happens in build_wrapper.sh when it cannot find the objcopy or nm tool: the script does not fail, but it also does not produce the object file we expect.

When CMake later builds libhs.a out of the available object files, it archives everything that exists and produces a libhs.a that is missing a number of symbol definitions but is otherwise still a valid static library.

Because the libhs.a build didn't fail properly, we eventually proceed to linking envoy-contrib (that's where hyperscan is used), and only at that point does the linker discover that not all symbol definitions are available.

Why not just use a hermetic toolchain?

I'd be happy to, but the published LLVM toolchains are quite limited and support only a few operating systems (note that hyperscan is specifically limited to the x86 architecture, so architecture is not the issue here).

So if you're building on a Linux distribution other than Red Hat or Ubuntu, you're basically out of luck at the moment, unfortunately.

Many companies and communities publish custom-built LLVM toolchains, but they typically do so as packages for the platform's package manager (e.g., deb, rpm), and toolchains_llvm, which we use to download the hermetic LLVM toolchain, does not support those.

Additional Description:

NOTE: I have a PR for the hyperscan library to make their build_wrapper.sh a bit more robust (intel/hyperscan#455), but judging by the history of the repository, it appears that they do not accept external PRs (or at least have not accepted them for a while), so I'm not hopeful.

NOTE: There are a few other problems with using non-hermetic toolchains as well, but I want to discuss those separately from hyperscan. The hyperscan issue is a bit more straightforward, while the other issues with non-hermetic toolchains may require more discussion and consideration of various alternatives.

Risk Level: Low
Testing: Manually verified that envoy-contrib builds successfully and that, if nm and objcopy aren't found, the build fails early; +ci
Docs Changes: n/a
Release Notes: n/a
Platform Specific Features: n/a

Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com>
Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com>
@repokitteh-read-only

As a reminder, PRs marked as draft will not be automatically assigned reviewers,
or be handled by maintainer-oncall triage.

Please mark your PR as ready when you want it to be reviewed!

🐱

Caused by: #43479 was opened by krinkinmu.


@krinkinmu krinkinmu requested review from soulxu and zhxie February 13, 2026 15:29
@krinkinmu krinkinmu marked this pull request as ready for review February 13, 2026 16:18
@repokitteh-read-only repokitteh-read-only bot added the deps Approval required for changes to Envoy's external dependencies label Feb 13, 2026
@repokitteh-read-only

CC @envoyproxy/dependency-shepherds: Your approval is needed for changes made to (bazel/.*repos.*\.bzl)|(bazel/dependency_imports\.bzl)|(api/bazel/.*\.bzl)|(.*/requirements\.txt)|(.*\.patch).
envoyproxy/dependency-shepherds assignee is @agrawroh

🐱

Caused by: #43479 was ready_for_review by krinkinmu.


@krinkinmu
Contributor Author

NOTE: I don't think that publish and verify is related to this PR, I can see other PRs in the queue facing the same issue.

@phlax
Member

phlax commented Feb 13, 2026

yeah - that is github breaking stuff again


@phlax phlax left a comment


lgtm, thanks @krinkinmu

@repokitteh-read-only repokitteh-read-only bot removed the deps Approval required for changes to Envoy's external dependencies label Feb 13, 2026
@phlax phlax merged commit 8cbcadf into envoyproxy:main Feb 13, 2026
28 of 29 checks passed
krinkinmu added a commit to krinkinmu/envoy that referenced this pull request Mar 4, 2026
…nvoyproxy#43479)

phlax pushed a commit that referenced this pull request Mar 4, 2026
…43479)

nickshokri pushed a commit to nickshokri/envoy that referenced this pull request Mar 17, 2026
…y#43479)

Signed-off-by: nick <nickshokri@google.com>
nickshokri pushed a commit to nickshokri/envoy that referenced this pull request Mar 17, 2026
…y#43479)
