13 upvotes, 5 direct replies (showing 5)
I want to thank everyone who has been patient as we improve the removal pipeline. When Pushshift first started, it wasn't well known and we received maybe one removal request every other month. We now get hundreds per month and the previous method of manually processing each one was taking too much time.
To answer a few questions raised in this thread:
1. How do you know I am the account owner?
A) Right now, we really have no way of verifying. At some point, we are going to have the ability for people to log into a portal via their Reddit credentials and instantly process the request. That will cover people who still own the account. For people who do not have access to their account, we will rely on an honor system until we can figure out the best way to balance people's privacy with malicious requests that doxx other people's accounts (which can be just as aggravating for someone who wants their data to be searchable).
Eventually, we may give people who can verify their account by logging in through a portal the ability to request a removal and have it processed within a few minutes. For those who don't have access to their account, we might first check Reddit to see whether their comments / submissions are still available and sync / mirror Reddit, so that material still available on Reddit stays available via the Pushshift API. If there is an urgent request because of PII or something like that, we will of course work with the person to get it removed as quickly as possible.
--------------------------------------------------------------------------------
2. What happens when a removal request is made?
A) Right now, we internally blacklist the account so that the data is not exposed via any public API. For full disclosure, we currently do not permanently delete any data unless there is a major issue involving PII, etc. While you have the right to request that people cannot search your comments and submissions via the public API, we reserve the right to keep data in our private archive so long as we never allow any data that you requested be removed get exposed through any public API endpoints.
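To illustrate the blacklist approach described above, here is a minimal sketch in Python. It is not Pushshift's actual code: the `Comment` type, the in-memory `ARCHIVE` and `BLACKLIST`, and the `public_search` function are all hypothetical names chosen for the example. It only shows the general idea that the private archive stays intact while the public-facing query filters out blacklisted authors.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Comment:
    author: str
    body: str


# Hypothetical stand-ins for the private archive and the internal blacklist.
ARCHIVE = [
    Comment("alice", "first comment"),
    Comment("bob", "second comment"),
    Comment("Alice", "third comment"),
]
BLACKLIST = {"alice"}


def public_search(archive, blacklist):
    """Return only the comments whose author is not blacklisted.

    Usernames are compared case-insensitively (an assumption of this sketch).
    Nothing is deleted from the archive; the filter applies only to what
    the public endpoint returns.
    """
    blocked = {name.lower() for name in blacklist}
    return [c for c in archive if c.author.lower() not in blocked]


visible = public_search(ARCHIVE, BLACKLIST)
print([c.author for c in visible])
```

Note that in this sketch the archive still holds all three comments after the call, which mirrors the "hidden, not deleted" behavior described in the answer.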
--------------------------------------------------------------------------------
3. I've put my account in your form -- when is it getting removed?
A) We're almost done with the automated pipeline that handles removals in batches and should have the first batch completed this weekend at the latest. The goal is to first get to a point where removal requests are processed within 24 hours and then eventually provide an online portal that you can log into using your Reddit credentials so that your removal request can be processed in minutes. The online portal would use Reddit OAuth -- meaning we would never see your password. Basically it works by Reddit telling us, "this person is who they say they are and they have access to this account." Unfortunately, if someone ever hacks your Reddit account, they could request removal of content for that account.
--------------------------------------------------------------------------------
4. I'm afraid people might abuse this and cause my material to be removed -- what happens then?
A) When we get the online portal up, not only will you be able to request removal, but you will have the ability to remove the removal flag so that your content is then available again through the API.
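A reversible removal flag like the one described above could look roughly like this in Python. This is only a sketch under assumptions: the `RemovalRegistry` class and its method names are hypothetical and do not come from the planned portal, and real account verification (for example via Reddit OAuth) is left out entirely.

```python
class RemovalRegistry:
    """Tracks which accounts are currently flagged as removed.

    Hypothetical sketch: the flag only controls visibility, so clearing it
    makes the account's content available again.
    """

    def __init__(self):
        self._flagged = set()

    def request_removal(self, username):
        # Set the removal flag for this account.
        self._flagged.add(username.lower())

    def restore(self, username):
        # Clear the removal flag; a no-op if the account was never flagged.
        self._flagged.discard(username.lower())

    def is_hidden(self, username):
        return username.lower() in self._flagged


registry = RemovalRegistry()
registry.request_removal("SomeUser")
print(registry.is_hidden("someuser"))  # flagged, so hidden
registry.restore("SomeUser")
print(registry.is_hidden("someuser"))  # flag cleared, visible again
```

Because the flag is just a toggle, an abusive removal request can be undone by the verified account owner without any data having been lost.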
--------------------------------------------------------------------------------
5. Will any of my data still be available in any form via your API once my removal request is processed?
A) Yes, but only via aggregations (like how many comments were made to Reddit per second, minute, hour, etc., or how much activity takes place in a subreddit). However, neither the comments or submissions you have made nor the fact that you ever made them will be available publicly. For example, if someone wants to know how many comments were made to Reddit last Tuesday, your previous comments will be a part of the sum of all comments, but that would be the extent of what would be available. Your actual comments / submissions would not be available via the public API endpoints.
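The difference between aggregate counts and individual listings can be sketched in Python as follows. The data, the `REMOVED` set, and both function names are hypothetical and chosen only for illustration; this is not the actual API implementation.

```python
from collections import Counter
from datetime import datetime, timezone

# Hypothetical sample data: (author, created_utc as a Unix timestamp).
COMMENTS = [
    ("alice", 1630000000),
    ("bob", 1630000060),
    ("alice", 1630003600),
]
REMOVED = {"alice"}


def comments_per_hour(comments):
    """Count all comments per UTC hour, including those by removed accounts.

    Only totals are returned; no author or comment text is exposed.
    """
    counts = Counter()
    for _author, created_utc in comments:
        moment = datetime.fromtimestamp(created_utc, tz=timezone.utc)
        counts[moment.strftime("%Y-%m-%d %H:00")] += 1
    return dict(counts)


def public_listing(comments, removed):
    """Return individual comments, excluding those by removed accounts."""
    return [(author, ts) for author, ts in comments if author not in removed]


print(comments_per_hour(COMMENTS))
print(public_listing(COMMENTS, REMOVED))
```

In this sketch, the removed account's comments still contribute to the hourly totals, but only the other account's comment shows up in the listing.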
--------------------------------------------------------------------------------
6. Can I get a copy of all my comments and submissions before the removal request is processed?
A) In the next several months, once the portal becomes available, you will have the opportunity to download all data that you posted and all comments that you made provided that you own the account (before the removal request is processed). There may be people who would like a copy of their Reddit history before their removal request is processed and we want to provide that tool to users in that situation.
--------------------------------------------------------------------------------
If anyone has any questions or concerns about this process, please feel free to raise your concerns here. We are doing our best to honor people's privacy while also providing a useful tool for researchers and people genuinely interested in finding topics that interest them more easily. We never intended this tool to be used to harass others but unfortunately we live in a world where some people just want to be genuine assholes.
Comment by Akaitori8 at 27/08/2021 at 20:26 UTC
19 upvotes, 6 direct replies
2. What happens when a removal request is made?
A) Right now, we internally blacklist the account so that the data is not exposed via any public API. For full disclosure, we currently do not permanently delete any data unless there is a major issue involving PII, etc. While you have the right to request that people cannot search your comments and submissions via the public API, we reserve the right to keep data in our private archive so long as we never allow any data that you requested be removed get exposed through any public API endpoints.
Great, so you STILL violate GDPR by keeping our data against our wishes...
Comment by JustHere2RuinUrDay at 29/12/2021 at 18:34 UTC
2 upvotes, 1 direct replies
I made a request for deletion a long, long while ago, before the Google form became a thing, back when you wanted us to comment our username under a Reddit post. My stuff still isn't deleted, and the service still collects new posts and comments and makes them searchable on various sites. So I filled out this Google form yesterday -- which is, by the way, a privacy nightmare as well -- in the hopes that you might finally actually honor these requests, and it's still not getting deleted. Is this a joke?
In my opinion this service shouldn't exist at all, since you're collecting and publishing data without notice or agreement. But now that it does, the very least you could do is actually delete that data upon request.
Right now, we internally blacklist the account so that the data is not exposed via any public API. For full disclosure, we currently do not permanently delete any data unless there is a major issue involving PII, etc. While you have the right to request that people cannot search your comments and submissions via the public API, we reserve the right to keep data in our private archive so long as we never allow any data that you requested be removed get exposed through any public API endpoints.
So all it takes is your servers getting breached. Glad something like that never happens, right?
My intention is to observe the laws governing the GDPR and make a good faith effort to follow the law to respect and protect the privacy of residents of the EU.
I do not think you're doing that. The GDPR allows you to collect data only after the user consented and only if there is legitimate interest in keeping that data. That is not the case with you collecting data from reddit users. It also gives EU citizens the right to have their data deleted, not to have their data made inaccessible to 3rd parties - and you're not even reliably doing the latter.
Comment by [deleted] at 27/08/2021 at 03:30 UTC
1 upvotes, 1 direct replies
[deleted]
Comment by [deleted] at 02/09/2021 at 10:46 UTC
1 upvotes, 1 direct replies
[removed]
Comment by parthivpatel94 at 07/11/2021 at 03:37 UTC
1 upvotes, 1 direct replies
Is this currently working? Are new batches being removed? I’ve submitted a few requests within the last 10 days and haven’t gotten any response or action yet.