C++ toolchain support #193

Draft
mayliex wants to merge 4 commits into stephenrkell:master from mayliex:master

mayliex commented Jun 7, 2026

No description provided.

difcsi commented Jun 8, 2026

Contributor

Hi,

Thanks for the PR!

if (const RecordType *RT = T->getAs<RecordType>()) {
  std::string name = RT->getDecl()->getNameAsString();
  if (!name.empty()) return "__uniqtype__" + name;
}

You are completely ignoring template arguments here: every specialisation of A<B> will be named __uniqtype__A, no matter what B is.
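To illustrate the collision, here is a minimal sketch that derives the symbol from the full printed type name rather than from the record's bare name. The helper names `escape_type_name` and `uniqtype_symbol`, and the `$`-plus-two-hex-digits escaping, are assumptions made up for this example; they are not the PR's code and not necessarily the escaping liballocs uses.

```cpp
#include <cctype>
#include <cstdio>
#include <string>

// Hypothetical helper (an assumption, not the liballocs scheme): make a printed
// type name such as "A<int>" symbol-safe by replacing every character outside
// [A-Za-z0-9_] with '$' followed by its two-digit hex code.
std::string escape_type_name(const std::string &printed) {
    std::string out;
    for (unsigned char c : printed) {
        if (std::isalnum(c) || c == '_') {
            out += static_cast<char>(c);
        } else {
            char buf[4];
            std::snprintf(buf, sizeof buf, "$%02x", static_cast<unsigned>(c));
            out += buf;
        }
    }
    return out;
}

// Hypothetical symbol builder: the prefix matches the snippet quoted above,
// but the suffix now depends on the template arguments as well.
std::string uniqtype_symbol(const std::string &printed) {
    return "__uniqtype__" + escape_type_name(printed);
}
```

With this, "A<int>" and "A<float>" map to different symbols, while a plain name such as "Point" is left unchanged.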


The following is not necessarily about the PR itself, but about C++ support in general.


Also, there is an interesting discussion to be had about #21 in terms of C++: is the built-in operator new an allocator or just a wrapper?

The C++ standard would say: it is unspecified whether the default operator new defers to std::malloc.

Typically, implementations do. Unfortunately, this is mostly built in, so we would need to transpile to something like

void* raw = std::malloc(sizeof(Point));
Point* p = new (raw) Point(42, 17);

The way I'd naturally treat this is that the new is a sub-allocation of malloc. Coming from C, I see why some could say that the unique type of *raw is already that of Point, but from a C++ view, the object is only constructed within the placement-new expression. It's as if we want to allocate Point p in a specific place in memory, and the pointer is merely our way of doing so. Though I see valid arguments for saying that the above is equivalent to

void init_point(Point* p, const int& x, const int& y) {
  p->x = x;
  p->y = y;
  // rest of the constructor body here
}

// reinterpret_cast is a keyword, not a member of std; static_cast suffices for void*
Point* p = static_cast<Point*>(std::malloc(sizeof(Point)));
init_point(p, 42, 17);

Your code seems to go with the first, and whilst I agree, I think this should be a conscious design decision.
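To make the first reading concrete, here is a minimal sketch of the full lifecycle under that model: the raw bytes come from malloc, and the Point only exists between the placement new and the explicit destructor call. The function names `make_point` and `destroy_point` are hypothetical, and the Point definition is a stand-in for the one used in the snippets above.

```cpp
#include <cstdlib>
#include <new>

// Stand-in for the Point used in this discussion (assumed layout).
struct Point {
    int x, y;
    Point(int x_, int y_) : x(x_), y(y_) {}
};

// Hypothetical helper: allocation and construction as two separate steps.
Point *make_point(int x, int y) {
    void *raw = std::malloc(sizeof(Point)); // step 1: untyped storage from malloc
    if (!raw) return nullptr;
    return new (raw) Point(x, y);           // step 2: the Point's lifetime starts here
}

// Hypothetical helper: the mirror image, destruction then deallocation.
void destroy_point(Point *p) {
    if (!p) return;
    p->~Point();   // the Point's lifetime ends; the storage is untyped again
    std::free(p);  // the malloc allocation ends
}
```

Under the sub-allocation view, a type-tracking runtime would see two nested events on the way in (malloc, then construction) and two on the way out, rather than one typed malloc.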

There is also the case to consider where operator new (via a class-specific override) allocates something other than the type-id within the new expression. An example of this would be an operator new stashing a cookie just before the object (Refcount anyone?).

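#include <cstddef>
#include <new>

struct Point {
    int x, y;
    Point(int x, int y) : x(x), y(y) {}

    using Cookie = int;
    static constexpr Cookie kMagic = 0xC0DE;
    static constexpr std::size_t kCookieSize = sizeof(Cookie);

    // Cookie size rounded up to Point's alignment. This is a function rather
    // than a static data member because Point is still incomplete where a
    // static member initializer would evaluate alignof(Point).
    static constexpr std::size_t offset() {
        return (kCookieSize + alignof(Point) - 1) & ~(alignof(Point) - 1);
    }

    // Allocate room for the cookie plus the object; hand out the object's address.
    void* operator new(std::size_t size) {
        char* base = static_cast<char*>(::operator new(size + offset()));
        *reinterpret_cast<Cookie*>(base) = kMagic;
        return base + offset();
    }

    // Step back over the cookie to recover the address that was really allocated.
    void operator delete(void* p) noexcept {
        if (!p) return;
        char* base = static_cast<char*>(p) - offset();
        ::operator delete(base);
    }

private:
    static Cookie cookie_of(Point* p) {
        return *reinterpret_cast<Cookie*>(reinterpret_cast<char*>(p) - offset());
    }
};

Point* p = new Point(2, 3);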

The underlying allocation is clearly not of type Point. But, to quote the description of new-expressions:

The new-expression returns a pointer to the object(s) of a pointer type derived from new-type-id or type-id. The program uses this pointer to access the newly allocated object

In this (overridden) case, new is clearly a sub-allocator of malloc:

   malloc -> new char[] -> new Point

which is fine. Porting the C sizeof analysis could even make the char[] a bit more nuanced and recognise the stashed cookie type as an anonymous struct. (But I'd argue that is well within future work. No sane person codes like this normally: the example reinvents private fields for no good reason. Though I see how injecting such an operator override into existing code might have benefits, e.g. for GC and friends.)

But in your code, new X is treated merely as X: by not following through the operator new overrides, and because of the static cast, we lose the type information of the preceding Cookie. In C-world, this is kept, as it is included in the malloc'd size.
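As a rough sketch of what "recognise the stashed cookie as a struct" could mean, the bytes handed out by the cookie-stashing operator new can be described by an ordinary struct whose second member is the Point. The name `CookieThenPoint` is hypothetical, the Point here is a simplified stand-in, and the checks assume a typical platform where int is 4 bytes with 4-byte alignment.

```cpp
#include <cstddef>

// Simplified stand-in for the Point in the example above (assumed layout).
struct Point { int x, y; };
using Cookie = int;

// Same rounding as the cookie offset in the example above:
// the cookie size rounded up to Point's alignment.
constexpr std::size_t kOffset =
    (sizeof(Cookie) + alignof(Point) - 1) & ~(alignof(Point) - 1);

// Hypothetical description of what the outer allocation really holds.
struct CookieThenPoint {
    Cookie cookie;  // lives at offset 0 of the raw allocation
    Point point;    // the address that the new-expression returns
};

// The struct's layout agrees with the hand-computed offset and total size.
static_assert(offsetof(CookieThenPoint, point) == kOffset,
              "point member sits where operator new places the object");
static_assert(sizeof(CookieThenPoint) == kOffset + sizeof(Point),
              "struct covers exactly the bytes requested from ::operator new");
```

If an analysis could infer such a struct, the outer allocation would carry the cookie's type and the Point would appear as a subobject of it, which is the information the plain "new X is X" treatment drops.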

@stephenrkell

Owner

Thanks @mayliex (and @difcsi for the useful comments)... this is great. I will look at this properly soon.

mayliex marked this pull request as draft June 9, 2026 11:50

mayliex commented Jun 9, 2026

Author

Hello @difcsi @stephenrkell! I forgot to add that this project is a draft. Thank you for the very detailed comments! You raise great points, but I need time to research the issues raised.

difcsi commented Jun 9, 2026

Contributor

No worries! I see you are at King's: I'm around on campus (7th Floor N09) most weekdays in case you'd like to chat!

difcsi commented Jun 9, 2026

Contributor

Added some more ramblings in #90

@stephenrkell

Owner

As discussed, the template names can be covered by the usual $-based escaping used in uniqtype symbol names.

Placement new is really a different operation than allocation -- more like "change type of" -- so needs a separate table and a wrapper for the two-argument operator new. The wrapper calls into liballocs to change the type. But it is OK for now if we don't support placement new, I think.
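A minimal sketch of that shape, under loud assumptions: `retype_table` is a stand-in for the separate table, and `retyping_placement_new` is a hypothetical wrapper around the two-argument operator new. Neither is a liballocs API; a real wrapper would call into liballocs where this one writes to a map.

```cpp
#include <new>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for the "separate table": address -> current type name.
// No liballocs function is used or implied here.
inline std::unordered_map<const void *, std::string> &retype_table() {
    static std::unordered_map<const void *, std::string> table;
    return table;
}

// Hypothetical wrapper for placement new: record the "change type of" event
// first, then construct the object in the storage the caller supplied.
template <typename T, typename... Args>
T *retyping_placement_new(void *where, const char *type_name, Args &&...args) {
    retype_table()[where] = type_name;
    return ::new (where) T(std::forward<Args>(args)...);
}
```

The wrapper never allocates; it only relabels existing storage, which is why it would not belong in the same table as real allocation functions.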

See my comments in #90 for some funky partial approaches to the sizeof analysis.
