FreeCAD 0.19_pre Crashes (SIGSEGV) on MacOS; Triggered by kicadStepUp

easyw-fc
Veteran
Posts: 3630
Joined: Thu Jul 09, 2015 9:34 am

Re: FreeCAD 0.19_pre Crashes (SIGSEGV) on MacOS; Triggered by kicadStepUp

Post by easyw-fc »

emacsimize wrote: Wed Jan 05, 2022 9:45 am So, I updated my Nvidia Linux drivers to the latest one and it stays the same.
FDL crashes instantly on importing the board with
also if you disable the 'removesubtree' function?
-
emacsimize wrote: Wed Jan 05, 2022 9:45 am I don't want to overstress this topic, since for me I can work with the weekly build. It's just now a personal interest why this is happening.
No problem, it is fine with me to investigate further, even if the FC daily is performing well...
emacsimize
Posts: 22
Joined: Wed Dec 05, 2018 9:01 am

Re: FreeCAD 0.19_pre Crashes (SIGSEGV) on MacOS; Triggered by kicadStepUp

Post by emacsimize »

also if you disable the 'removesubtree' function?
Yes, still, because it doesn't even get that far; it already crashes while importing the board.

"Funny" fact: I tested all this on a different PC, and there the weekly build also crashes on added tracks :roll:

I will try to figure out the differences between the two systems.
berka
Posts: 88
Joined: Sat Aug 22, 2015 9:08 pm

Re: FreeCAD 0.19_pre Crashes (SIGSEGV) on MacOS; Triggered by kicadStepUp

Post by berka »

emacsimize wrote: Fri Jan 07, 2022 10:25 am I will try to figure out the differences between the two systems
I suspect you are barking up the wrong tree.

The behavior we both reproduced makes me think there is a memory-safety bug here: something like a use-after-free, a buffer overflow, or a read of uninitialized memory.

You said you were going to try building from source. If you succeed in building from source with debug symbols, you may be able to run in a debugger and pick over the carcass at the time it happens. However, the underlying bug may have done its damage long before you get the SIGSEGV.

In my initial experiments, I was trying to simplify my board to help debug. Sometimes I would change something, think I had found the trigger, undo and redo, and still hit the crash.
berka
Posts: 88
Joined: Sat Aug 22, 2015 9:08 pm

Re: FreeCAD 0.19_pre Crashes (SIGSEGV) on MacOS; Triggered by kicadStepUp

Post by berka »

Wow. It's tough to figure out where the CPU is in a large project with limited symbols. No source lines (that I could find) in the released binary, so I was looking in the wrong area for a long time. It doesn't help that I haven't touched x86 assembly in decades, or that I don't use lldb much.

Looking at the source for the tip of the backtrace, I found this gem:
https://github.com/FreeCAD/FreeCAD/blob ... .cpp#L3221

Code:

ViewProviderDocumentObject *DocumentItem::getViewProvider(App::DocumentObject *obj) {
    // Note: It is possible that we receive an invalid pointer from
    // claimChildren(), e.g. if multiple properties were changed in
    // a transaction and slotChangedObject() is triggered by one
    // property being reset before the invalid pointer has been
    // removed from another. Currently this happens for
    // PartDesign::Body when cancelling a new feature in the dialog.
    // First the new feature is deleted, then the Tip property is
    // reset, but claimChildren() accesses the Model property which
    // still contains the pointer to the deleted feature
    //
    // return obj && obj->getNameInDocument() && pDocument->isIn(obj);
    //
    // TODO: is the above isIn() check still necessary? Will
    // getNameInDocument() check be sufficient?


    if(!obj || !obj->getNameInDocument()) return 0;
That check at the end only catches a null obj pointer. It doesn't catch the "invalid pointer" described in the comment above.
getNameInDocument does one more sanity check. However, it too only looks for a null pointer, and it does so inside a potentially invalid object.
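
To make the distinction concrete, here is a minimal C++ sketch of why a null check alone cannot reject a stale pointer, while a membership test against the owner (in the spirit of the commented-out pDocument->isIn(obj) line) can. Object, Document, passesNullCheck and passesOwnerCheck are hypothetical stand-ins invented for illustration, not FreeCAD's classes or API. The sketch deliberately keeps the storage alive so that it never touches freed memory itself.

```cpp
#include <cassert>
#include <memory>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for App::DocumentObject; not FreeCAD's real class.
struct Object {
    const char* name = "Pad";
};

// Hypothetical stand-in for a document that tracks which objects it still owns.
class Document {
public:
    Object* add() {
        owned_.push_back(std::make_unique<Object>());
        live_.insert(owned_.back().get());
        return owned_.back().get();
    }

    // Models deletion: the object leaves the live set. The storage is kept
    // in this sketch so the example never reads freed memory.
    void remove(const Object* obj) { live_.erase(obj); }

    // Membership test that never dereferences obj.
    bool isIn(const Object* obj) const { return live_.count(obj) != 0; }

private:
    std::vector<std::unique_ptr<Object>> owned_;
    std::unordered_set<const Object*> live_;
};

// Null check only: a removed (stale) object still passes.
bool passesNullCheck(const Object* obj) { return obj != nullptr; }

// Null check plus a lookup in the owner: a removed object is rejected.
bool passesOwnerCheck(const Document& doc, const Object* obj) {
    return obj != nullptr && doc.isIn(obj);
}
```

With these stand-ins, a pointer taken before remove() still passes passesNullCheck() afterwards, but passesOwnerCheck() rejects it. Whether such a lookup would be cheap enough in the real tree view is a separate question.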

On x86-64 macOS, I see this in one crash:

Code:

(lldb) disassemble 
libFreeCADApp.dylib`App::DocumentObject::getNameInDocument:
    0x1015e1170 <+0>:  pushq  %rbp
    0x1015e1171 <+1>:  movq   %rsp, %rbp
    0x1015e1174 <+4>:  movq   0x300(%rdi), %rax
    0x1015e117b <+11>: testq  %rax, %rax
    0x1015e117e <+14>: je     0x1015e118a               ; <+26>
->  0x1015e1180 <+16>: testb  $0x1, (%rax)
    0x1015e1183 <+19>: jne    0x1015e118e               ; <+30>
    0x1015e1185 <+21>: incq   %rax
Execution is past the null-pointer check (<+11>) and is trying to read memory at (%rax), which holds an inaccessible address according to the debugger. No surprise there...

Selected registers, and the resulting attempt to read that memory in the debugger, are below:

Code:

(lldb) register read 
General Purpose Registers:
       rax = 0x40060b0100000000
...
       rdi = 0x0000000107df3e00
       rsi = 0x0000000107df3e00
       rbp = 0x00007ffeefbfaf30
       rsp = 0x00007ffeefbfaf30
...
       rip = 0x00000001015e1180  libFreeCADApp.dylib`App::DocumentObject::getNameInDocument() const + 16
    rflags = 0x0000000000010206
...

(lldb) memory read 0x40060b0100000000
error: memory read failed for 0x40060b0100000000
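
As a side note on why the read fails outright: assuming the usual 48-bit virtual addressing on x86-64 (4-level paging), a valid pointer must have bits 63..47 all equal. The value in rax does not, so it cannot be a mapped address at all. A small self-contained check, where isCanonical48 is a helper name made up for this sketch:

```cpp
#include <cassert>
#include <cstdint>

// True if addr is a canonical x86-64 virtual address, ASSUMING 48-bit
// addressing (4-level paging): bits 63..47 must be all zeros or all ones.
// Systems using 5-level paging follow a different (57-bit) rule.
constexpr bool isCanonical48(std::uint64_t addr) {
    const std::uint64_t top = addr >> 47;  // the 17 bits 63..47
    return top == 0 || top == 0x1FFFF;
}
```

Under that assumption, 0x40060b0100000000 is rejected, while the rdi and rsp values from the dump pass. That would be consistent with the field at 0x300(%rdi) no longer holding a pointer, though the dump alone does not say what it holds instead.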
easyw-fc
Veteran
Posts: 3630
Joined: Thu Jul 09, 2015 9:34 am

Re: FreeCAD 0.19_pre Crashes (SIGSEGV) on MacOS; Triggered by kicadStepUp

Post by easyw-fc »

berka wrote: Sat Jan 08, 2022 8:57 am Looking at the source for the tip of the backtrace, I found this gem:
Hi @berka,
thanks for digging so deep into this issue...
Do you have any insight into what I could do to avoid this?
Do you think it is related to the wb, FC, or the graphics drivers?
I never had any issue on Windows or Linux on my PCs... (unfortunately I can no longer test on OS X because my Mac has gone in peace)
berka
Posts: 88
Joined: Sat Aug 22, 2015 9:08 pm

Re: FreeCAD 0.19_pre Crashes (SIGSEGV) on MacOS; Triggered by kicadStepUp

Post by berka »

easyw-fc wrote: Sat Jan 08, 2022 10:37 am Do you have any insight into what I could do to avoid this?
Do you think it is related to the wb, FC, or the graphics drivers?
If this is the problem, I think the blame lies deep in FreeCAD. It should be fixed by someone familiar with the whole stack. The fix may come at the expense of performance (based on that comment).

These kinds of issues appear non-deterministic. You may not be seeing it because the OS schedules threads differently or because of the number of cores available (assuming multi-threaded access to objects). The scariest part is that (based on the comment) the crash only happens because the memory has already been recycled and used for a different purpose. In most cases, this method would follow the invalid pointer, run its checks on the destructed object, and be none the wiser. No crash, but potential for silent damage to the feature, model, document or whatever.
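
To illustrate the "recycled memory" scenario without invoking undefined behavior, here is a small C++ sketch using slot indices with generation counters. This is a different technique from what FreeCAD does, and it is not a proposed patch; Store, Handle and all member names are hypothetical. A raw index (standing in for a raw pointer) happily reads whatever now lives in the recycled slot, while a generation-checked handle notices that it is stale.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical handle: a slot index plus the generation it was issued for.
struct Handle {
    std::size_t index;
    std::uint32_t generation;
};

// Hypothetical object store that recycles slots, loosely modelling an
// allocator that reuses freed memory for a new object.
class Store {
public:
    Handle create(std::string name) {
        if (!free_.empty()) {
            const std::size_t i = free_.back();  // reuse a freed slot
            free_.pop_back();
            slots_[i].name = std::move(name);
            slots_[i].alive = true;
            return {i, slots_[i].generation};
        }
        slots_.push_back({std::move(name), 0, true});
        return {slots_.size() - 1, 0};
    }

    void destroy(Handle h) {
        if (!valid(h)) return;
        slots_[h.index].alive = false;
        ++slots_[h.index].generation;  // invalidates every older handle
        free_.push_back(h.index);
    }

    bool valid(Handle h) const {
        return h.index < slots_.size() && slots_[h.index].alive &&
               slots_[h.index].generation == h.generation;
    }

    // Checked access: returns nullptr for a stale handle.
    const std::string* name(Handle h) const {
        return valid(h) ? &slots_[h.index].name : nullptr;
    }

    // Unchecked access by raw index, analogous to following a raw pointer.
    // The caller must pass an index below the number of slots.
    const std::string& rawName(std::size_t index) const {
        return slots_[index].name;
    }

private:
    struct Slot {
        std::string name;
        std::uint32_t generation;
        bool alive;
    };
    std::vector<Slot> slots_;
    std::vector<std::size_t> free_;
};
```

If an object is created, destroyed, and a second one created, the second reuses the first slot. rawName() on the old index then silently returns the new object's name, which is the "no crash, but silent damage" case, whereas name() with the old handle returns nullptr.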

It's not a fix, but you may be able to work around the issue on the KiCadStepUp side. If you have a sense of what leads down this path, maybe make some defensive cleanup/recompute calls before it. The performance will undoubtedly suck, but at least the behavior may be correct.
berka
Posts: 88
Joined: Sat Aug 22, 2015 9:08 pm

Re: FreeCAD 0.19_pre Crashes (SIGSEGV) on MacOS; Triggered by kicadStepUp

Post by berka »

FWIW, @realthunder and @wmayer are in the 'blame' for these lines. It looks like the method got there as part of a refactor. I'm not sure what other history it has. They may be the best equipped to understand the implications and how to fix it so that the condition in the note never happens. It's too late by the time this line is reached.

I think it's way past time we formalize this in a bug report.

Edit: Bah. Never mind. Mantis says 0.20 bugs only. I'll have to try reproducing it there, or at least make an effort to diff 0.19 and 0.20 (though the fix could be somewhere else).
berka
Posts: 88
Joined: Sat Aug 22, 2015 9:08 pm

Re: FreeCAD 0.19_pre Crashes (SIGSEGV) on MacOS; Triggered by kicadStepUp

Post by berka »

I was able to reproduce on 0.20 weekly build pretty easily. I opened a bug.
issue #4823

Anyone who can reproduce it, especially on the more "popular" Linux and Windows, should add a crash report to support it.
I would consider any crash with the tip of the backtrace showing these two (and maybe libc, more recently) to be related:

Code:

App::DocumentObject::getNameInDocument() const
Gui::DocumentItem::getViewProvider(App::DocumentObject*)
easyw-fc
Veteran
Posts: 3630
Joined: Thu Jul 09, 2015 9:34 am

Re: FreeCAD 0.19_pre Crashes (SIGSEGV) on MacOS; Triggered by kicadStepUp

Post by easyw-fc »

berka wrote: Sat Jan 08, 2022 6:59 pm FWIW, @realthunder and @wmayer are in the 'blame' for these lines. It looks like the method got there as part of a refactor. I'm not sure what other history it has. They may be the best equipped to understand the implications and how to fix it so that the condition in the note never happens. It's too late by the time this line is reached.
wmayer wrote: ping
realthunder wrote: ping
would you please have a look at this issue?
berka wrote: Sat Jan 08, 2022 7:55 pm I was able to reproduce on 0.20 weekly build pretty easily. I opened a bug.
https://tracker.freecad.org/view.php?id=4823

Anyone who can reproduce it, especially on the more "popular" Linux and Windows, should add a crash report to support it.
I would consider any crash with the tip of the backtrace showing these two (and maybe libc, more recently) to be related:

Code:

App::DocumentObject::getNameInDocument() const
Gui::DocumentItem::getViewProvider(App::DocumentObject*)
wmayer
Founder
Posts: 20243
Joined: Thu Feb 19, 2009 10:32 am
Contact:

Re: FreeCAD 0.19_pre Crashes (SIGSEGV) on MacOS; Triggered by kicadStepUp

Post by wmayer »

Anyone who can reproduce it, especially on the more "popular" Linux and Windows, should add a crash report to support it.
I have tried it now approx. 10x and it has never crashed.
I would consider any crash with the tip of the backtrace showing these two (and maybe libc, more recently) to be related:
It's obvious that there is a dangling pointer somewhere, because it crashes inside DocumentObject::getNameInDocument() even though that implementation is fine. This means getNameInDocument() is being called on a corrupted pointer to a DocumentObject, which you can also see in DocumentItem::getViewProvider(). The obj pointer passes the null check and then causes a crash, and this is due to an access-after-delete error, as berka already found out.

The question then is which object exactly is accessed after deletion. Is it the DocumentObject itself, or rather a DocumentObjectItem that gives access to a DocumentObject?