Nice! For anybody who does not have access to the referenced IEEE paper (such as me), here it is on the website of its authors:
https://pure.tue.nl/ws/portalfiles/portal/3522084/6724346112...
And the GitLab repo:
https://gitlab.tue.nl/20040367/pybaobab
I suspect decision trees are still highly underutilized for optimizing human-in-the-loop processes that require an actual script a person needs to follow.
It's not uncommon to be faced with making a series of decisions in a business environment, and old-school decision trees give a remarkably clear, readable output of how to make those decisions in as few choices as possible.
Any data scientists working with call centers, sales teams, or customer support people would likely find a surprisingly useful application for this mostly forgotten (other than as a building block for RFs) tool.
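As a purely illustrative sketch of what such a "script" can look like, scikit-learn can print a fitted shallow tree as nested if/else rules; the dataset and feature names here are just stand-ins:

```python
# Hypothetical sketch: turn a small fitted decision tree into a readable
# set of rules an agent could follow. The data is illustrative only; only
# scikit-learn's public API is used.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_iris(return_X_y=True, as_frame=True)
clf = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)

# export_text prints the tree as nested if/else rules, one decision per line.
print(export_text(clf, feature_names=list(X.columns)))
```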
It's not that decision trees are underutilized, but rather that one of their disadvantages is that they can overfit their training data very easily. This is actually one of the reasons Random Forests exist - just make a lot of overfitted decision trees and then vote for consensus.
The visualization looks great, though it did suffer when visualizing the Random Forest. I could see using it for a single Decision Tree to convey the data's structure. Definitely going to use it for any DT slides I have to make.
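A minimal sketch of that overfitting point, assuming scikit-learn and an arbitrary example dataset: an unconstrained tree scores near-perfectly on its own training data, while a forest of such trees generalizes better on held-out data.

```python
# Sketch of the overfitting argument: a single unconstrained decision tree
# memorizes the training set, a random forest of many such trees does not.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

tree = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)
forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)

print("tree   train/test:", tree.score(X_tr, y_tr), tree.score(X_te, y_te))
print("forest train/test:", forest.score(X_tr, y_tr), forest.score(X_te, y_te))
```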
People can also follow additive scores. This is what e.g. the DSM does.
A spreadsheet-like printed table where you mark items and sum the scores in your head is probably easier to follow than a similarly powerful decision tree. Of course, you can't guarantee that a linear decision boundary exists, but when one does, the standard tools (Gauss-Markov/FWL theorems, p-values, etc.) are much, much more robust than CART or C4.5.
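For what it's worth, a rough sketch of turning a linear model into such an additive scorecard; the rounding scheme is ad hoc and purely illustrative:

```python
# Hypothetical sketch of the "additive score" idea: fit a linear model and
# round its coefficients into small integer points that can be summed by hand.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

data = load_breast_cancer()
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=5000))
model.fit(data.data, data.target)

coefs = model[-1].coef_.ravel()
points = np.round(coefs / np.abs(coefs).max() * 5).astype(int)  # -5..+5 points
for name, p in zip(data.feature_names, points):
    if p != 0:
        print(f"{name:30s} {int(p):+d}")
```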
Agreed... but the decision trees we learn from data to do classification are quite different from the decision trees we design to formalise our processes. I think several replies here have missed the distinction.
I had a weird experience recently. We tore out our deck during Covid lockdown and I had a rusty nail puncture my skin. The doctor who treated it pulled out a sheet that was essentially a decision tree that determined that after cleaning the wound I just needed to be sent home with a lollipop. (Recent tetanus shot was sufficient.)
The strange part was when the doctor showed me the tree and asked me if I agreed... My response was "You're the doctor, you tell me?!"
Not really. We do data science for many companies, and when explainability matters we will likely go for decision trees anyway if the lift from more advanced models is minor. We use this (not quite as pretty a visualization, but extremely useful for getting a grasp of the tree model):
https://github.com/parrt/dtreeviz
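A minimal usage sketch, assuming the pre-2.0 dtreeviz API (newer releases wrap the model object differently):

```python
# Sketch of dtreeviz usage under the pre-2.0 API (dtreeviz.trees.dtreeviz).
from dtreeviz.trees import dtreeviz
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
clf = DecisionTreeClassifier(max_depth=3, random_state=0).fit(iris.data, iris.target)

viz = dtreeviz(clf, iris.data, iris.target,
               target_name="species",
               feature_names=iris.feature_names,
               class_names=list(iris.target_names))
viz.save("iris_tree.svg")  # or viz.view() to open it in a browser
```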
What I'd like is some clever way of visualizing a random forest in its entirety, rather than just showing the individual trees as they do in
https://gitlab.tue.nl/20040367/pybaobab/-/raw/main/images/ra...
I have no idea how to go about doing this.
I have been taught that the proper way is to use an algorithm that will summarize the random forest as a close-enough decision tree. I have also been taught that it can be very misleading...
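One simple version of that idea is a surrogate tree distilled from the forest's own predictions; a rough sketch, with the faithfulness check left to an agreement score (which is exactly where the "very misleading" part can show up):

```python
# Hypothetical sketch of summarizing a random forest as a close-enough tree:
# fit a single surrogate decision tree to the forest's *predictions*
# (simple model distillation), then check how often they agree.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

surrogate = DecisionTreeClassifier(max_depth=4, random_state=0)
surrogate.fit(X, forest.predict(X))  # learn to mimic the forest, not the labels

agreement = surrogate.score(X, forest.predict(X))
print(f"surrogate agrees with the forest on {agreement:.1%} of the data")
```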
This is always what I think of when I see "baobab":
https://wiki.gnome.org/Apps/DiskUsageAnalyzer
So was the Little Prince applying daily regularization to the baobabs on his planet?
Does anyone know of any software that allows people to edit decision trees? I have a use case where some non-technical people need to write a bunch of basically nested if-checks, and it would be good if they could do it visually.
A mindmapper (e.g. XMind) could be (ab)used for that.
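Another low-tech option, sketched under the assumption that plain nested data is acceptable: keep the hand-authored tree as JSON-like data that non-technical people edit, and evaluate it with a tiny interpreter. The field names below ("question", "yes", "no", "answer") are made up for illustration.

```python
# Hypothetical sketch: represent the nested if-checks as editable data and
# walk it with a small interpreter, keeping the tree itself out of the code.
tree = {
    "question": "Is the customer an existing subscriber?",
    "yes": {
        "question": "Is the issue billing-related?",
        "yes": {"answer": "Route to billing team"},
        "no": {"answer": "Route to technical support"},
    },
    "no": {"answer": "Route to sales"},
}

def walk(node, ask):
    """Follow the tree, asking yes/no questions until an answer is reached."""
    while "answer" not in node:
        node = node["yes"] if ask(node["question"]) else node["no"]
    return node["answer"]

# Example run with canned responses instead of real user input:
answers = iter([True, False])
print(walk(tree, lambda q: next(answers)))  # -> "Route to technical support"
```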
This is really pretty. Thanks for sharing, I will probably leverage this in my training!
This looks incredible! Excited to try this!