Some observations on Google's tracking and not always telling
I left my mom to housesit for a couple of weeks; when I came back, my youtube got totally polluted! More on that in a second.
I certainly would never log into youtube; it is way too much fun to see it figuring out who I am (well, not that much fun). Now, I've never used youtube much (an occasional search leading to some tech or cat video...), but lately I've been stuck alone, and needed stimulation. So I've been watching from a couple of TVs - often in the background while I wire-wrap some crap or code some other crap.
It is fascinating to watch how the algorithm converges on you. Now it's not rocket science to correlate watching habits from two sets on the same network - after a while suggestions spill across the two sets. It's also not that hard to throw my phone and other computing gear into the bucket.
It is also really interesting that it has some caution built in. TVs in particular are slow to converge - there may be more than one user, and google doesn't just jump right in. It tests the waters slowly -- throwing a video in and seeing if you will bite. After a confirmation, more stuff is introduced.
I imagine it keeps the history of every device, and correlates histories with humans. For instance I am alone at the house with my phone on, and I am sure google knows that (who doesn't have a phone in their pocket? My future self, I hope). Now the histories can be connected with a good probability. The two devices are used at different times? Is one in the bedroom perhaps? That's a lot of useful data. Oh, look - he turned off the TV and started an audio app with a timer. Good night.
So it is my conjecture that google keeps device histories and de-anonymizes them as time goes on. Time is on its side. A year passes after you used your work computer with lots of searches but no useful identifying information. Then you turn it on, log into a bank account, and bam. All that data is now attached to you.
Because your bank's website is a cesspool of web bugs! Ironically, I have a secured browser for normal use, and a completely insecure crappy browser for banking, because that's the only way I can log in! Arrrhh.
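I have no idea how google actually does any of this, but the conjecture above is easy to sketch as a toy. Everything here (the DeviceLedger class, the device names, the events) is made up for illustration: anonymous events pile up per device, and a single identifying event labels the whole pile after the fact.

```python
from collections import defaultdict


class DeviceLedger:
    """Toy model: events accumulate per device until an identity shows up."""

    def __init__(self):
        self.events = defaultdict(list)  # device_id -> list of events seen
        self.owner = {}                  # device_id -> identity, once known

    def record(self, device_id, event, identity=None):
        self.events[device_id].append(event)
        if identity is not None:
            # One identifying event (say, a bank login) labels the device.
            self.owner[device_id] = identity

    def dossier(self, identity):
        """Everything ever seen on devices tied to this identity, old or new."""
        return [
            event
            for device_id, who in self.owner.items()
            if who == identity
            for event in self.events[device_id]
        ]


ledger = DeviceLedger()
ledger.record("work-pc", "search: water heater noise")
ledger.record("work-pc", "search: bowling-ball cannon")
before = ledger.dossier("me")   # nothing is attached to anyone yet
ledger.record("work-pc", "login: bank", identity="me")
after = ledger.dossier("me")    # the old searches come along for the ride
print(before, after)
```

The point of the toy is only that nothing needs to be known at the time the data is collected. The label can arrive a year later and it still applies to everything already stored.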
And that's not even using trickery. If you use chrome, google can probably uniquely identify you just by how you type! I remember someone on some forum issued a challenge to see if anyone could de-anonymize him, and a day later he was identified - his posts had an unusual transposition of letters, and it was correlated to some previous post with a similar typo. And that was just some guy without big data backing him.
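I don't know how that forum guy actually did it, but the idea fits in a few lines. This is a rough sketch under my own assumptions: the tiny vocabulary, the sample posts and the function names are all invented, and the only "typo" it looks for is two adjacent letters swapped.

```python
def transpositions(word):
    """All strings made by swapping two adjacent letters of word."""
    return {
        word[:i] + word[i + 1] + word[i] + word[i + 2:]
        for i in range(len(word) - 1)
    }


def typo_fingerprint(text, vocabulary):
    """Set of (typo, intended) pairs: words one adjacent swap away from a known word."""
    found = set()
    for raw in text.lower().split():
        word = raw.strip(".,!?;:'\"()")
        if not word or word in vocabulary:
            continue  # empty or correctly spelled, nothing to learn
        for candidate in transpositions(word):
            if candidate in vocabulary:
                found.add((word, candidate))
    return found


# Hypothetical vocabulary and posts, for illustration only.
VOCAB = {"the", "their", "receive", "because", "weird"}
known_post = "I did not recieve teh package."
anon_post = "Funny, becuase I also did not recieve it."

shared = typo_fingerprint(known_post, VOCAB) & typo_fingerprint(anon_post, VOCAB)
print(shared)
```

One shared typo proves nothing by itself, of course. But a rare one that keeps showing up across posts narrows things down fast, and that was without any big data.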
And of course once you are de-anonymized, your browser's signature is instantly and forever connected to your dossier.
Back to my story: after two weeks with my mom, my youtube was unwatchable! I had finely tuned it to perfect geekery, with a good mix of electronics repair videos, retrocomputing, machining and manufacturing, and of course cats. When I got back, it was full of weird shit I can't imagine my mom watching - movie stars whose tits fell out during a shoot. Movie stars who actually had sex during a sex scene. 'Watch this video before licking a woman's vagina and you will be surprised'. Really, Mom?
Now to be honest, it's not that hard to amplify a trend, and I could not resist peeking, and soon there was nothing but that junk. I actually had to wipe my history and clear searches, and it quickly got back to reasonable.
Back in the early 2000s I had a chat with a guy who worked at a marketing company. He said that (even with the shallow data pool back then) they could predict their customers' purchases with amazing accuracy - often before the customer was even aware of the upcoming purchase. At the time I dismissed it as bragging, but now I think he was telling the truth.
Humans know surprisingly little about themselves. Or should I say, humans don't want to appear predictable, but they are. One of my favorite moments in the show "Westworld" is when it is revealed that AI consciousness requires an absurd amount of storage, while humans can be stored in a few petabytes. Hah!
Back when I was in high school, teachers always scared kids with their "Permanent Record". Like, if you fail this test it will be on your Permanent Record! If you don't listen to me, I will make a notation on your Permanent Record! Needless to say, there was no such thing - and if there was, no one ever looked at it and it was shredded. Now there is.
I realized that a long time ago and stopped using Google directly. But of course, they have other ways to track you - most sites on the net voluntarily embed google bugs just to see how many hits they get. In exchange they give up their customer base - or reader base. That really sucks.
They know how much you earn, the price of your house, and they can tell that your water heater needs replacement. I am certain they can tell that mine is old - the data is out there, and I am getting enough 'look inside a water heater', 'drain your water heater', and 'amazing advances in water heaters' videos to be almost certain they know. They could probably tell me when it will fail based on the house water usage, which I am sure they have access to, and the number of people in the house over the past few decades.
But then again, your phone collects more data than you can imagine, especially in concert with other data. Even if your location services are 'off', the cameras, microphones, nearby devices, wifi routers, cell towers, accelerometers matched to other users (say you are on a bus or in a taxi), and so many things I don't know about - are constantly tracking your every step.
And adding it to your Permanent Record.
Remember the Nazi Enigma story? After breaking the encryption, the Brits had to keep it quiet to avoid losing the golden goose. They had to let the Germans bomb places, attack unexpectedly and otherwise think they were safe. They had to invent a new science of gently screwing with statistics to make their wins look accidental; leave plausible explanations for their good luck, and often take a preventable loss...
I think the situation is similar here. A lot of entities know an awful lot about you and, using algorithms (please do not use the term AI loosely), can predict your behavior. But they won't tell you, and you certainly won't guess they know what they know.
Because it would be obvious, and very, very creepy.
And so when youtube, out of nowhere, pulls up a video about photography, it feels like a surprise (how the heck do they know?), but not really. Sure, I haven't searched for anything related to my photography hobby (currently inactive), but I must've a few years ago. Probably when I bought my last camera.
What is interesting is that the videos are largely about a point I've made in the past (not online, but I suppose phones and Alexas are listening) - it is more important to get a good lens than an expensive high-megapixel camera. Not too unusual but still - all these videos about lower-end/older cameras outperforming high-end modern cameras? I guess they know which camera I got - I searched for manuals and such (It's a Fujifilm X-E2, a truly amazing little camera, btw. Look, controls!). Heck, I bought it online, duh. And based on that camera alone (and a couple of lenses on ebay), they can probably tell a lot about my photographic ideals.
What is surprising is that my new 'clean slate' youtube is not pushing the right-wing crap I was getting before - probably from my weakness for bowling-ball cannon videos, which apparently only jerks watch. I suppose some sensitivity to polarized political views is required to avoid a riot.
Here is an odd thing: a day after a conversation with a somewhat bigoted relative, in which I cautiously listened to his transphobic ideas but didn't say much (some things take time and finesse), a truly offensive ad started popping up urging me to boycott some company because of a transgender spokesperson. Boy did they get this wrong! I felt great, until I realized that now I have _that_ on my permanent record. Ah, screw them.