💾 Archived View for dioskouroi.xyz › thread › 29417135 captured on 2021-12-04 at 18:04:22. Gemini links have been rewritten to link to archived content


-=-=-=-=-=-=-

Pybaobab – Python implementation of visualization technique for decision trees

Author: sebg

Score: 152

Comments: 16

Date: 2021-12-02 15:16:31

Web Link

________________________________________________________________________________

ktpsns wrote at 2021-12-02 16:10:21:

Nice! For anybody who does not have access to the referenced IEEE paper (such as me), here it is on the website of its authors:

https://pure.tue.nl/ws/portalfiles/portal/3522084/6724346112...

harabat wrote at 2021-12-02 18:28:03:

And the GitLab repo:

https://gitlab.tue.nl/20040367/pybaobab

time_to_smile wrote at 2021-12-02 17:12:39:

I suspect decision trees are still highly underutilized for optimizing human-in-the-loop processes that require an actual script a person needs to follow.

It's not an uncommon problem that you're faced with needing to make a series of decisions in a business environment, and old-school decision trees give a remarkably clear, readable description of the optimal way to make those decisions in as few choices as possible.

Any data scientists working on teams with call centers, sales teams, or customer support people would likely find a surprisingly useful application for this tool, which is mostly forgotten other than as a building block for RFs.
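To make the "script a person needs to follow" idea concrete, here is a minimal sketch (assuming scikit-learn, which is not mentioned in the thread) of printing a fitted tree as plain-text if/else rules that a human could read off a sheet. The iris dataset stands in for real business data.

```python
# Hedged sketch: render a fitted decision tree as a readable text script
# using scikit-learn's export_text. Dataset and depth are illustrative.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

data = load_iris()
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(data.data, data.target)

# export_text produces nested "|---" rules, readable without any plotting
# library -- the kind of output a call-center script could be built from.
script = export_text(tree, feature_names=list(data.feature_names))
print(script)
```

The same approach works with any tabular features; the point is that the fitted model is directly legible, not just a black box.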

tsumnia wrote at 2021-12-02 17:17:57:

It's not that decision trees are underutilized; rather, one of their disadvantages is that they overfit their training data very easily. This is actually one of the reasons Random Forests exist: build a lot of overfitted decision trees and then vote for consensus.
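The overfit-then-vote point can be sketched as follows (scikit-learn assumed; the synthetic data and seeds are made up for illustration): an unconstrained tree memorizes noisy training labels, while a forest of such trees averages the noise away.

```python
# Hedged sketch: a single unconstrained tree memorizes its training set;
# a Random Forest of many such trees votes, reducing variance.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 5))
# Labels driven by one feature plus noise, so perfect training fit = overfit.
y = (X[:, 0] + 0.5 * rng.normal(size=400) > 0).astype(int)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

tree = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)
forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)

print("tree   train/test:", tree.score(X_tr, y_tr), tree.score(X_te, y_te))
print("forest train/test:", forest.score(X_tr, y_tr), forest.score(X_te, y_te))
```

The lone tree scores perfectly on its training split because it keeps splitting until every sample is isolated; the gap to its test score is the overfitting being described.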

The visualization looks great, though it did struggle with visualizing the Random Forest. I could see using it for a single decision tree to convey the data's structure. Definitely going to use it for any DT slides I have to make.

prionassembly wrote at 2021-12-02 18:23:56:

People can also follow additive scores. This is what e.g. the DSM does.

A spreadsheet-like printed table where you mark items and sum scores in your head is probably easier to follow than a similarly powered decision tree. Of course, you can't guarantee that a linear decision boundary exists, but when there is one, the standard tools (Gauss-Markov/FWL theorems, p-values, etc.) are much, much more robust than CART or C4.5.
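The additive-score idea above can be sketched in a few lines. The item names, weights, and threshold here are entirely invented for illustration; a real scorecard (like those in the DSM) would come from domain experts or a fitted linear model.

```python
# Toy sketch of an additive scorecard: each checked item contributes fixed
# points, and the decision is a simple threshold on the summed total.
# All names and numbers below are hypothetical.
SCORECARD = {
    "symptom_a": 2,
    "symptom_b": 1,
    "symptom_c": 3,
}
THRESHOLD = 4  # decision boundary on the total score

def total_score(checked_items):
    """Sum the points for the items marked on the printed sheet."""
    return sum(SCORECARD[item] for item in checked_items)

def decide(checked_items):
    """Threshold the total: exactly the mental arithmetic a clinician does."""
    return "refer" if total_score(checked_items) >= THRESHOLD else "send home"
```

Because the decision is a linear function of the checked items, this is just a linear classifier written out on paper, which is why the standard linear-model toolbox applies to it.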

jmmcd wrote at 2021-12-02 22:08:02:

Agreed... but the decision trees we learn from data to do classification are quite different from the decision trees we design to formalise our processes. I think several replies here have missed the distinction.

__mharrison__ wrote at 2021-12-02 17:51:42:

I had a weird experience recently. We tore out our deck during Covid lockdown and I had a rusty nail puncture my skin. The doctor who treated it pulled out a sheet that was essentially a decision tree that determined that after cleaning the wound I just needed to be sent home with a lollipop. (Recent tetanus shot was sufficient.)

The strange part was when the doctor showed me the tree and asked me if I agreed... My response was "You're the doctor, you tell me?!"

riedel wrote at 2021-12-02 18:58:46:

Not really. We do data science for many companies, and when explainability is a requirement we will likely go for decision trees if the lift from more advanced models is minor anyway. We use this (not quite as pretty a visualization, but extremely useful for getting a grasp of the tree model):

https://github.com/parrt/dtreeviz

carabiner wrote at 2021-12-02 21:12:46:

What I'd like is some clever way of visualizing a random forest in its entirety, rather than just showing the individual trees as they do in

https://gitlab.tue.nl/20040367/pybaobab/-/raw/main/images/ra...

. I have no idea how to go about doing this.

nestorD wrote at 2021-12-03 01:05:14:

I have been taught that the proper way is to use an algorithm that will summarize the random forest as a close-enough decision tree. I have also been taught that it can be very misleading...
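The "close-enough decision tree" approach mentioned here is sometimes called fitting a surrogate (or distilled) tree. A hedged sketch, assuming scikit-learn: train a single tree on the forest's *predictions* rather than the true labels, then check how faithfully it mimics the ensemble — the fidelity check matters precisely because, as noted, the summary can be misleading.

```python
# Hedged sketch of a surrogate tree: approximate a Random Forest with one
# shallow tree trained on the forest's own predictions.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Fit the surrogate to the forest's outputs, not the original labels.
surrogate = DecisionTreeClassifier(max_depth=4, random_state=0)
surrogate.fit(X, forest.predict(X))

# Fidelity: fraction of samples where the single tree agrees with the forest.
# Low fidelity would mean the summary tree is misleading.
fidelity = (surrogate.predict(X) == forest.predict(X)).mean()
print(f"surrogate agrees with forest on {fidelity:.1%} of samples")
```

The depth cap (4 here, chosen arbitrarily) trades readability against fidelity: a deeper surrogate matches the forest better but stops being the small, explainable tree you wanted.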

Cyphase wrote at 2021-12-03 01:49:42:

This is always what I think of when I see "baobab":

https://wiki.gnome.org/Apps/DiskUsageAnalyzer

vletal wrote at 2021-12-02 18:45:44:

So was the Little Prince applying daily regularization to the baobabs on his planet?

stavros wrote at 2021-12-02 23:38:18:

Does anyone know of any software that allows people to edit decision trees? I have a use case where some non-technical people need to write a bunch of basically nested if-checks, and it would be good if they could do it visually.
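One lightweight answer to this use case is to keep the tree as plain data (JSON/YAML, or anything a form-based editor can produce) and evaluate it in code, so non-technical people edit structure rather than nested if-statements. A minimal sketch with invented field names:

```python
# Hedged sketch: a decision tree as editable plain data. Non-technical
# users edit the nested dict (e.g. via a JSON file); code walks it.
# All question and outcome names here are hypothetical.
TREE = {
    "question": "is_existing_customer",
    "yes": {
        "question": "order_over_100",
        "yes": "offer_discount",
        "no": "standard_checkout",
    },
    "no": "show_signup",
}

def evaluate(node, answers):
    """Follow yes/no answers down the tree until reaching a leaf (a string)."""
    while isinstance(node, dict):
        node = node["yes"] if answers[node["question"]] else node["no"]
    return node
```

Pairing this with any generic tree/outline editor (or a mind mapper, as suggested below in the thread's sense of visual editing) gives the visual front end, while the data stays trivially machine-readable.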

layer8 wrote at 2021-12-03 02:05:51:

A mindmapper (e.g. XMind) could be (ab)used for that.

__mharrison__ wrote at 2021-12-02 17:45:57:

This is really pretty. Thanks for sharing, I will probably leverage this in my training!

morelandjs wrote at 2021-12-02 17:04:25:

This looks incredible! Excited to try this!