Yeah, this is the sort of thing that scares me about trusting a startup's service enough to sign up. Somewhere out there, there's a startup that has millions of users' personal information stored on a soon-to-be-disgruntled former employee's personal laptop.
Yeah, I don't think a lot of people understand the lengths some companies go to in order to control access to their systems (whether because of risk aversion, regulatory requirements, or something else). It's on a different planet from the startup mentality of 'We hire trustworthy people, so all 10 employees have access to prod'. This is 'How do we protect customer data from 1 in 2000 IT staff across 100 countries?'
For example, I was working at a massive bank that was implementing an access control system for their servers, which meant that admins didn't have direct access. You had to:
- Submit a change ticket that went through all the approvals.
- This triggered a workflow through a web interface that opened an RDP or text terminal session to the server for the appropriate person (OS, application, database, etc.).
- The RDP session was recorded to video, the text session was logged.
- Once you logged out, the password on the server was auto-rotated.
- The text session was indexed and searchable. The software coordinating all this was able to match server logs against the video of the RDP session, so you could search through the video ("show me the SQL Server commands that admin ran").
- On a regular basis, the server state would be reconciled against all the logged tickets for that server and discrepancies investigated.
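To make that flow concrete, here's a rough sketch of the ticket -> brokered session -> log -> rotate-on-logout loop. This is not the bank's actual software. `AccessBroker`, `Ticket`, `Session` and every method name below are hypothetical, and a plain list of commands stands in for the recorded video and text logs.

```python
import secrets
from dataclasses import dataclass, field


@dataclass
class Ticket:
    ticket_id: str
    server: str
    approved: bool = False  # set once the change ticket has all its approvals


@dataclass
class Session:
    ticket: Ticket
    commands: list = field(default_factory=list)  # stand-in for the session recording

    def run(self, command: str) -> None:
        # Record the command; a real system would also forward it to the server.
        self.commands.append(command)


class AccessBroker:
    """Hypothetical broker: admins never hold the server password themselves."""

    def __init__(self) -> None:
        self.passwords = {}  # server -> current password, known only to the broker
        self.audit_log = []  # (ticket_id, server, command) tuples

    def open_session(self, ticket: Ticket) -> Session:
        if not ticket.approved:
            raise PermissionError(f"ticket {ticket.ticket_id} is not approved")
        self.passwords.setdefault(ticket.server, secrets.token_urlsafe(16))
        return Session(ticket)

    def close_session(self, session: Session) -> None:
        t = session.ticket
        # Index the session so it can be searched and reconciled later.
        self.audit_log.extend((t.ticket_id, t.server, c) for c in session.commands)
        # Rotate the password as soon as the admin logs out.
        self.passwords[t.server] = secrets.token_urlsafe(16)

    def search(self, text: str) -> list:
        # Case-insensitive substring search over the logged commands.
        return [entry for entry in self.audit_log if text.lower() in entry[2].lower()]
```

Usage would look something like `s = broker.open_session(ticket); s.run("SELECT 1"); broker.close_session(s)`, after which `broker.search("select")` finds the command and the old password no longer works. The reconciliation step would then compare `audit_log` against what actually changed on the server.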
Almost every bank.
My country assigns a category to every bank. One of the requirements for earning a better category is an AUDITED production environment and its processes. A better category gives the institution access to bigger, riskier and more profitable business. That is, production environment practices affect the business directly.
My employer has an ITIL system in place, and that is the only way to touch prod.
It is not for everyone, I admit!
I'm currently developing a side project that stores medical data, and a big chunk of the design work I've done so far is about how to work on systems where you never see production data (things like accurate faking of data, seeding correctly, etc.).
Just yanking a copy of production to a local machine is ridiculously and horrifically common pretty much everywhere I've worked.
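For anyone wondering what "faking and seeding" can look like in practice, here's a minimal sketch using only the Python standard library. The field names, the name lists and `fake_patients` are all made up for illustration and are not from my actual project. The point is just that a fixed seed gives every developer the same fake data set without anyone touching prod.

```python
import random
from datetime import date, timedelta

# Hypothetical value pools; a real project would want far more variety.
FIRST_NAMES = ["Alex", "Sam", "Jordan", "Taylor", "Morgan"]
LAST_NAMES = ["Smith", "Jones", "Garcia", "Chen", "Okafor"]
CONDITIONS = ["asthma", "hypertension", "type 2 diabetes", "migraine"]


def fake_patients(count: int, seed: int = 42) -> list[dict]:
    """Generate `count` fake patient records, reproducibly for a given seed."""
    rng = random.Random(seed)  # local RNG, so the global random state is untouched
    patients = []
    for i in range(count):
        # Pick a birth date within roughly 70 years after 1940-01-01.
        birth = date(1940, 1, 1) + timedelta(days=rng.randrange(0, 365 * 70))
        patients.append({
            "id": i + 1,
            "name": f"{rng.choice(FIRST_NAMES)} {rng.choice(LAST_NAMES)}",
            "date_of_birth": birth.isoformat(),
            "condition": rng.choice(CONDITIONS),
        })
    return patients
```

Calling `fake_patients(100)` twice returns identical records, so test fixtures and bug reports stay reproducible. Making the fake data statistically resemble the real thing is the harder part, and this sketch doesn't attempt it.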
It's going well! Finally found a couple of co-founders, building a new version based on lessons learned from the alpha. The data model finally works the way I want it to work.
Spoke to a friend in local gov who put me in touch with the right people to deal with our equivalents, though, so I will be compliant. In addition, since users self-elect to insert their data, I've got my company's legal representative looking into it as well.
Medical data makes me nervous, but this could help a lot of people. It's a product I'd have used immediately if it existed.
You don't have to fake data in order to analyze production data safely. It is possible to anonymize personal information, although it often requires a security expert to make sure you're doing it correctly.
Anonymizing PII is actually incredibly difficult. The problem is that the same things that create the valuable uniqueness you need to operate on the data are the things that are most sensitive.
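One way to see the problem is a k-anonymity style check: count how many records share the same combination of "harmless" fields. A minimal sketch with made-up records follows. `smallest_group` is a hypothetical helper, and k-anonymity is only one of several measures people use here.

```python
from collections import Counter


def smallest_group(records, quasi_identifiers):
    """Return k, the size of the smallest group of records that share the same
    values for the given quasi-identifier fields. Expects a non-empty list."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(groups.values())


# Made-up records with names already removed.
records = [
    {"zip": "90210", "birth_year": 1980, "sex": "F", "diagnosis": "asthma"},
    {"zip": "90210", "birth_year": 1981, "sex": "F", "diagnosis": "migraine"},
    {"zip": "90211", "birth_year": 1980, "sex": "M", "diagnosis": "diabetes"},
    {"zip": "90211", "birth_year": 1981, "sex": "M", "diagnosis": "asthma"},
]

# The detailed fields single out every person: k = 1.
print(smallest_group(records, ["zip", "birth_year", "sex"]))  # prints 1

# Dropping detail raises k to 2, but the data is now much less useful.
print(smallest_group(records, ["sex"]))  # prints 2
```

With k = 1, anyone who knows a person's zip, birth year and sex can pick out their diagnosis even though no name is stored. Coarsening the fields fixes that at the cost of exactly the detail you wanted to analyze.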
There's no way to reliably anonymize medical data in the general case. PII/PHI often exists within free-text fields and file attachments. You can't remove it all just by searching and replacing.
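Here's a quick illustration of why search-and-replace falls short on free text. The patterns and the sample note are made up, and `scrub` is a hypothetical helper, not a recommended de-identification tool.

```python
import re

# Patterns for identifiers that happen to have a predictable shape.
PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
}


def scrub(text: str) -> str:
    """Replace every pattern match with a bracketed label such as [SSN]."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


note = (
    "Pt John Carter (SSN 123-45-6789, 555-867-5309) lives next to the "
    "bakery on Elm St; email jc@example.com."
)
scrubbed = scrub(note)
print(scrubbed)
```

The SSN, phone number and email get replaced, but "John Carter" and "next to the bakery on Elm St" sail straight through, because names and locations in free text don't follow a pattern a regex can pin down. File attachments (scans, PDFs, images) aren't even reachable this way.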
A doctor or hospital only needs the flimsiest of "business justifications" to share or sell data to whoever they want. What the data can be used for is limited, but it can be shared all over the place.
Source: Pulling this data is part of my job.
There are very strict laws regarding the use of PHI. It cannot be shared "all over the place". PII/PHI is regulated to the max; if you're working for a place that pulls or uses the data from places like the one I work for, then you guys had better be darn sure that it isn't personally identifiable, or both your place and my place are in for a world of hurt at the next audit.
Yeah, I know. One thing I saw was Google soliciting hospitals to store DNA records with them. The idea that Google could absorb thousands of people's DNA records without their permission by offering a storage service to a hospital...
That's terrifying.