I discovered thousands of open databases on AWS (infosecwriteups.com)
283 points by boyter on Jan 31, 2022 | hide | past | favorite | 77 comments


Elasticsearch until recently did not nudge you to set up a username or password by default. I noticed the last time I installed it on a fresh instance that on completion of the install it gives you a warning about this and tells you what to do to set a password. That is a small improvement.

Most people would not have the service bound to a public interface, but for those who do, for whatever reason, have a setup where they are accessing it remotely, at least now there is a tip-off that it is completely open to the world by default. This is different from pretty much any other service you might install. MariaDB, for instance, by default does not allow remote root login even if you change the config to bind it to 0.0.0.0.
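To illustrate the MariaDB point, here's a minimal sketch, assuming a Debian-style package install (the config path is an assumption): even after opening up the bind address, remote root stays blocked unless you explicitly grant it.

```shell
# Bind to all interfaces (the risky part):
sed -i 's/^bind-address.*/bind-address = 0.0.0.0/' \
  /etc/mysql/mariadb.conf.d/50-server.cnf

# Check which hosts root may connect from; by default it is localhost only,
# so a remote root login fails even with the port exposed:
mysql -u root -e "SELECT User, Host FROM mysql.user WHERE User='root';"
```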

A lot of people are just totally unaware of this issue. When I read about database leakage, I generally assume 90% probability it was elastic, these days. Defaults are so important.


Re Elasticsearch. It used to be much worse.

Even basic authentication was only available to those with an X-Pack subscription; it didn't have any security unless you paid money or used Nginx or another server to apply the auth and proxy the connection.

That didn't stop people from making instances available publicly, and it wasn't until Elasticsearch started hitting the news about all the open instances that they made it available in the free tier, from May 2019 (6.8.0 / 7.1.0).
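Even in those free-tier versions, basic auth must be switched on explicitly. A hedged sketch, assuming the default package config and binary paths:

```shell
# Enable security (available in the free tier since 6.8.0 / 7.1.0):
echo 'xpack.security.enabled: true' >> /etc/elasticsearch/elasticsearch.yml
systemctl restart elasticsearch

# Set passwords for the built-in users (elastic, kibana, etc.):
/usr/share/elasticsearch/bin/elasticsearch-setup-passwords interactive
```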


Honestly, a lot of these services should be secure and closed by default; they make them open by default to onboard people more easily when they're in the POC phase of their project. But there's nothing as permanent as a temporary solution.


Mongo for years defaulted to being open to the internet, with no authentication.


As, surprisingly, did so many other databases and operating systems. VAX/VMS was "FIELD/SERVICE" on all new systems for many years. Then all these systems got connected to DECnet/BITNET, etc., and all heck broke loose. No real excuse for anybody to have ever configured their systems this way.


And it's probably the main reason MongoDB is a thing at all.


When I learnt Mongo, that was literally the first thing I had to look up, because the docs didn't explain it, or even mention it was insecure.

But once it was set up, it gave you a login, and I realized it was completely public: anyone could just log in there. It's a horrible default to have, and it's weird more people didn't realise how insecure it was.


This doc page has been up for as long as I can remember: https://docs.mongodb.com/manual/administration/security-chec.... It's also stated in the production notes: https://docs.mongodb.com/manual/administration/production-no....


I’m pretty sure Amazon now scans for and warns you if you’re running an open Elasticsearch service on the public internet.


Yikes lol. I spin up tons of self-hosted MySQL boxes on public IPs, but never would I ever expose 3306 or allow remote root lol. I just have a thin API in Java that feeds queries into it with parameterized statements.


I think people would be surprised how often it's entirely deliberate. I've been told "this is why we used the cloud... so dinosaur sysadmins like you wouldn't slow us down" at many different companies where I've raised this, and I've had this debate in startup communities where "just open everything up" gets defended as "agile" and "modern devops".


I have stopped trying to be the "adult" in our company. I admit that I don't know everything, but two decades of experience have given me some insights. Yet when I make a suggestion, with examples to back it up, my experience is often used against me. I have learned to let it go and allow them to fail on their own. I'm not going to work there for long anyway.


So much of this. A company I worked for was like that: I brought up that credit cards were not encrypted, that the monitoring was everywhere but not on critical systems, that devs were using backdoors (basically "if dev, return admin"), etc. Nothing changed. Recently they reported a data leak. No shit...


I think data leaks aren't punished hard enough. If they were, and stung the wallet really hard, the industry would quickly adapt to better security requirements.


I'm not so sure -- many managers would just see it as an expeditious cost of doing business, unless it got to kill-the-company-now levels of fines.

I was once talking with an FCC agent about the status of some boards we were importing, and he told me about how Michael Dell would blatantly violate FCC regs with their computers, fight it for a while, then pay the fine out of a "Federal Regulatory Account" -- Dell just happily violated the law and paid the fines often enough to keep a specific bank account for it. The agent was not amused.

I'm pretty sure that it would require a demonstrably real threat of jail all the way up the management chain -- that might focus their minds a bit.


No jail; crippling fines that, if the behavior doesn't change, would wipe them out. One example would be sufficient and the industry would fix itself up.


>>unless it got to kill-the-company-now levels of fines.

I did mention that possibility - but to emphasize, it'd have to be effectively a death penalty for the corporation - just wipe them out.

I'm still not sure that would be sufficient, because there are soooo many examples of horrible managers/executives just torching $$millions and failing up. If any of those executives ever got another job much above the level of sandwich-maker, it would provide no deterrent.

It really looks like, at many levels, network and data security needs to be treated as a national-security-level problem, and any decision to fail to implement state-of-the-art protections treated as a deliberate compromising of NatSec, akin to leaving classified material or CUI (controlled unclassified information) on the sidewalk; not sure what the applicable law is, but that level of negligence is at least the start of serious trouble.


> I've had to get told "this is why we used the cloud... so dinosaur sysadmins like you wouldn't slow us down" by many different companies where I've raised this, and I've had this debate in startup communities where "just open everything up" gets defended as "agile" and "modern devops".

I guess there is some merit to the zero trust security model, since it should urge you to secure every bit of software that you run, without necessarily setting up bastion hosts/DMZs or other types of "server-based ACLs" like that: https://en.wikipedia.org/wiki/Zero_trust_security_model (I've seen the fact that you have a "perimeter" get used as an excuse to get lazy about security behind it).

Yet I've found that it's just too risky, given how brittle most of our software, including its security functionality, actually is. Just the other day I had even the simple blog software that I use (Grav) fail to enforce its admin login functionality, something that seems incredibly simple on the surface and should be bulletproof, yet somehow wasn't: https://blog.kronis.dev/everything%20is%20broken/grav-securi...

Thus, personally, I've found it worthwhile to take a bit from both approaches. For example, run everything in containers with overlay networks (not even dedicated service meshes necessarily; even the way Docker Swarm does it is pretty good: https://docs.docker.com/network/network-tutorial-overlay/) that can expose your apps/web servers on every node where necessary, yet keep your DBs constrained to the overlay network and thus inaccessible from the outside.

If and when you need to connect to them, maybe run a container with SSH and WireGuard/OpenVPN that lets you forward ports and thus get access to the overlay network, or alternatively just use orchestrator functionality for this (for example, Kubernetes allows it, though I dislike its complexity otherwise).
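A sketch of the overlay-network idea with Docker Swarm (service and image names are illustrative): the app publishes a port, while the DB joins the overlay network but publishes nothing.

```shell
# Attachable overlay network spanning the swarm nodes:
docker network create -d overlay --attachable backend

# DB service: on the overlay, no published ports, so unreachable from outside:
docker service create --name db --network backend \
  -e MYSQL_ROOT_PASSWORD=change-me mysql:8

# App service: on the same overlay, publishing only its web port:
docker service create --name app --network backend -p 443:8443 example/app

# Inside the overlay, the app reaches the DB as db:3306; the internet cannot.
```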


I failed hard on an edit to pam.d, which resulted in any password being accepted for any user. Anything but an empty password, that is.

Not something I would usually test for, but lesson learned. Thank his noodliness we don't have ssh exposed anywhere outside our network.


Like you, I've got lots of RDS instances (MySQL). Generally they're accessible only to certain EC2 instances, so I just SSH into the appropriate EC2 to access the database. There are sometimes cases where it's expedient to expose them, like if I need to quickly test some backend code remotely before deploying it. I keep a security group that allows inbound from 0.0.0.0/0 on 3306 and just assign it if I need to for a couple of minutes.

Even if it's only open for a few minutes, there will be logs of malicious actors trying to brute force the DB's password. So how this comes as a surprise to admins is beyond me, but I haven't played with Elasticsearch at all.


Honestly, you'd do better scripting the AWS CLI to update the security group with your IP address, which is how we've handled WFH for server SSH/Plesk/SQL access.

It's like half a dozen lines of bash; I'll see if I've got it on my home PC.
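A sketch of such a script; the security group ID is a placeholder, and a real version would also revoke the previously allowed IP.

```shell
#!/usr/bin/env bash
set -euo pipefail

SG_ID="sg-0123456789abcdef0"   # placeholder: your security group

# Discover the current public IP:
MY_IP="$(curl -s https://checkip.amazonaws.com)"

# Allow only that /32 in on MySQL's port:
aws ec2 authorize-security-group-ingress \
  --group-id "$SG_ID" --protocol tcp --port 3306 --cidr "${MY_IP}/32"
```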


SSM Port Forwarding and Tailscale are your friend. I'm using a mixture of both with most staff WFH now and their home IP addresses not being static.

Tailscale especially is an amazing product though I've only had it running at work and home for around a month. All that said, a simple script as you pointed out is also great :)


What’s fun is when those IPs remain in the rulesets even after they’re no longer used or needed - and the residential DHCP changes and now someone else has a direct line.

Or it points to an AWS IP that eventually gets handed back and recycled …


And that's where you both implement and act on security policies to mitigate and avoid this based on the threat model you're up against.


That's a cool idea! Never dawned on me to do that.


If you’re already ssh-ing to the database, why not tunnel the db connection over ssh?

It would probably be less effort than updating security rules each time
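A minimal sketch of that tunnel (hostnames and usernames are placeholders): the local port forwards through the bastion to the DB, so the DB never needs a public rule at all.

```shell
# Forward local 3306 to the DB as seen from the bastion/EC2 host:
ssh -N -L 3306:mydb.internal.example.com:3306 ec2-user@bastion.example.com

# In another terminal, connect as if the DB were local:
mysql -h 127.0.0.1 -P 3306 -u appuser -p
```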


Ssh tunnels are so badass. I’ve heard of people performing encrypted db replication through them between data centers


I’ve witnessed somebody piping dd over an ssh tunnel to write blocks from one machine directly to another. Highly NOT recommended but still one of the neatest things I’ve seen. It worked.


That sort of thing can be used for making disk images. Restoring them would be the only way to verify them, as checking the md5sum of the file is not possible without a reference backup -- and if you have a reference backup, why would you be using dd through a pipe lol. Definitely a cool trick though.
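For what it's worth, a piped copy can be checked without a reference backup by checksumming both ends of the pipe. A local sketch (over ssh you'd run the second dd and the checksum on the remote side):

```shell
# Make some sample data standing in for a block device:
dd if=/dev/urandom of=src.img bs=1K count=64 2>/dev/null

# The piped copy, dd-to-dd:
dd if=src.img bs=1K 2>/dev/null | dd of=dst.img bs=1K 2>/dev/null

# Checksum both ends; the sums should match:
md5sum src.img dst.img
cmp src.img dst.img && echo "copies match"
```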


Sounds like a neat party trick! (At least, if you've got a really stable connection.) Did they do an md5sum afterward, or how did they know it worked?


Fair question. I used to do that always. But now, if I'm logged into a bunch of services, AWS, mail, etc in the browser, I don't want to switch my IP address to test something for fear of getting bounced or locked out. Especially if I'm in a rush to fix something.


In that case you'd only have a different IP if you hit the localhost port of that tunneled connection, not everything on your computer. Unless you set up your browser and mail to proxy through that port there wouldn't be any issue with "switching IPs".


This isn't what happens - you do a port forward and access your services through localhost:port. Nothing else goes through the tunnel. Endpoint services see the EC2 IP. There is no IP "switching" or risk of getting banned.

But if your ec2 service has security group access to multiple things it's extremely useful because you can hit localhost:xxxx for postgres, localhost:yyyy for elasticsearch, etc.


This wouldn't be efficient in my case. I'm testing a frontend that's served on localhost:80 and connects to a local backend also on localhost:80 that makes DB calls to some remote database on 3306. I'd have to modify the backend code to connect through the proxy port every time I wanted to test it. As it is the backend code on my computer is a mirror of the production environment. Normally if I tunnel I have it set up to forward everything through the tunnel except localhost.


In this case you'd simply do the tunnel for 3306:3306.

Your frontend and backend code would run locally and connect to MySQL through the proxy. You wouldn't have to change any backend configuration.


Plus one for this. I use SSH tunneling with private keys to access my MariaDB or MySQL instances.


I mean, especially on AWS, controlling access to 3306 is dead simple, and it's literally the one thing that anyone running a DB (which should be everyone) does to ensure that only certain machines, VLANs, etc. can talk to 3306 on said DB instances...

Like, literally ONLY this internal IP is allowed to contact the DB.

Also, as far back as 2014 we had AWS scanning all our machines constantly for open ports and alerting on them - we had to get approval from AWS to continue to port-scan our own machines... they were pretty on top of port scanning behavior...

And I am surprised that the stance they take hasn't improved in all these years.


> For some companies, the DMARC and SPF records did not allow emails to be received in their mail servers from outside the company. What a mess, they cannot be reached by email at all.

Huh? Those protocols are just about sending emails - not receiving mail. Was this supposed to be MX records and firewall rules?


If SPF only includes the company IPs and the DMARC policy is to drop unknown senders, then the mail won't be received by the company's mail server; it will be dropped.


I can't really follow this. If I am trying to email you, there is nothing that could be broken about your DMARC policy which would prevent your mail server receiving my email. What matters would be my own policy, as the sender.

Now you might prevent me seeing your replies, but that's another matter.


DMARC and SPF apply to the domain of the sender, not the recipient.

If the SPF records for domain.tld only allow their network's IPs, it means that they won't accept mails FROM domain.tld that come from outside their network.

Mails TO domain.tld should be accepted normally, no matter where they come from.

It seems to me that the researcher was trying to communicate with those companies by sending them emails using a FROM address of the company. That is, he was sending mails with a forged FROM and SPF/DMARC stopped them as it was designed to do.

Alternatively, maybe those companies were using some kind of email filtering service as "front MXes" that then tried to deliver to the company's own servers. In that case it would indeed be a mistake not to include the filtering service's forwarder IPs in the SPF record, and they would indeed be unreachable from outside the company. In my experience this does happen from time to time, but it is very noticeable (your users will complain) and it gets fixed quickly. It's very strange to me that the researcher met this scenario multiple times...
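To make the mechanics above concrete, here is what the records in question look like (domain.tld and the IP range are placeholders), and how to inspect a domain's published policy:

```shell
# Illustrative DNS TXT records:
#   domain.tld.         TXT  "v=spf1 ip4:198.51.100.0/24 -all"
#   _dmarc.domain.tld.  TXT  "v=DMARC1; p=reject; rua=mailto:dmarc@domain.tld"
#
# The SPF "-all" hard-fails any sender IP outside 198.51.100.0/24 claiming to
# be domain.tld, and the DMARC p=reject tells receivers to drop such mail.

# Inspect what a real domain publishes:
dig +short TXT domain.tld
dig +short TXT _dmarc.domain.tld
```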


I’d assume a number of these would be due to the good old “docker punching a hole through UFW”, which still hasn’t been fixed.


I'm very embarrassed to admit this, but I literally just discovered this. I thought UFW was literally the last line of defense. Gotta run to fix this.

(luckily, all just pet projects with no sensitive stuff - still…)


We had this issue previously and solved it by using Docker's built-in DOCKER-USER iptables chain, which lets you filter traffic before it hits Docker's own rules. Adding the relevant lines to UFW's before.rules means they persist across reloads.

For more on Docker's built-in chains, see: https://docs.docker.com/network/iptables/#add-iptables-polic...
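A sketch of such a rule (the interface name and trusted subnet are assumptions about your network): drop outside traffic to a published DB port before Docker's own ACCEPT rules see it.

```shell
# Insert into the DOCKER-USER chain, which Docker consults before its own
# rules; block TCP 3306 from everything except the trusted subnet:
iptables -I DOCKER-USER -i eth0 -p tcp --dport 3306 \
  ! -s 192.168.0.0/24 -j DROP
```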


Ha, this reminds me of the day I found out. I've port scanned my own servers ever since, just in case. It's such an easy footgun to fall for, they should really put a big scary warning somewhere on the Docker installation page for Ubuntu or something.


Why would you bind to a public IP and then use a firewall when you could just not bind to a public IP in the first place?


The general expectation when running an application is that when you misconfigure your system, the firewall will at least block incoming traffic. If I run ./program, the system will prevent me from fucking up too badly; it should do the same if I docker run vendor/program, but there that extra layer of defence is gone.

Most programs still bind to 0.0.0.0 and that's not a problem if your firewall works the way you expect it to. Docker's firewall rules overriding UFW's firewall rules without notice is unexpected for many people.

I know I fell for this one years ago when I first messed around with Docker, luckily I wasn't running anything important.


Because people use docker-compose.yaml files verbatim, or copy docker run command lines, which more often than not do “-p 8000:8000”, instead of “-p 127.0.0.1:8000:8000”. Therein lies the issue.
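The difference between the two forms, side by side (image name is a placeholder):

```shell
# Publishes on all interfaces; reachable from anywhere the host is,
# and Docker's iptables rules bypass UFW to allow it:
docker run -d -p 8000:8000 example/image

# Binds the published port to loopback only; reachable just from the host:
docker run -d -p 127.0.0.1:8000:8000 example/image
```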


What's the difference? The latter is your own device, yet the former still lets you connect to it.

I'm kinda sleepy, so I'm sure it's a stupid question.

Edit: 127.0.0.1 is only getting in and nobody else.


Sometimes you want to allow traffic only from specific public IPs, or to rate limit traffic.


People are too worried about using k8s, containers and whatnot when they don't even know the basics of firewalls, setting basic permissions on their DBs, etc.

Default secure configs help a lot. But ensuring at least a basic auth system is present should be the duty of every developer/devops, etc.


I think the development of easy distribution tools like Docker and K8s (well, easy once it's finally set up right…) makes it a bit too easy for some developers to push stuff into production.

These tools allow you to avoid the basics and go straight from local dev to production with just a few lines of YAML and a (probably pre-built) pipeline from somewhere. That's great if you're taking into account standard security configuration because you can deploy new tools instantaneously, but it's terrible if you've never thought about firewalls and ACLs before.

In a world where the solution to dependency management is "ship an entire Linux install", it's become too easy to mess up. If you don't understand exactly what your magic "just make it work" tools actually do and how they modify your system configuration, you shouldn't use them to run stuff in production.


Not really an issue as far as AWS is concerned, as they view this as being on the customer, per the company's Shared Responsibility Model. [0] If you don't know how to manage infrastructure, physical or virtual, perhaps it's time to learn. And if you don't want to, you can string a bunch of SaaS solutions together and hope those companies have their act together internally.

[0] https://aws.amazon.com/compliance/shared-responsibility-mode...


This isn't a new thing, sadly; numerous reports like this about both Mongo and Firebase have been published before.


Doubtful on Firebase. It's a managed service so it's not in the same category as these DBs, and is also authenticated by default.


I used to work on the Firebase realtime database - many people went to production with public reads/writes. We spent a lot of time trying to educate people not to do that.

Examples: https://www.comparitech.com/blog/information-security/fireba... and http://ghostlulz.com/google-exposed-firebase-database/


Relevant: buckets.grayhatwarfare.com

Public, as in unsecured, not with intent.


1. Isn't this a gap in the platform? I mean, shouldn't AWS automate this and alert (via AWS Config rules / AWS Security Hub, or AWS Trusted Advisor) owners of these misconfigured resources?

2. Even better, should AWS even let customers proceed if they have misconfigurations?


#1, probably. #2, no. AWS is not responsible for how you set up your resources. They are very clear about that unless you use ProServe or something. They are not your IT dept, even for managed services.


Policing their customers' configurations would open them up for liability and embarrass them and their customers. But they do provide a ton of tools for customers to investigate their own security.

AWS basically rents digital chainsaws. You hope the customers know how to use them, but you know a fair number probably don't.


That sounds like an excellent subscription plan, I'm sure they sell those services already.

Of course Amazon assumes that you know what you're doing, so it won't include such scans by default. If you set up a container to be world accessible, you're probably intending to do so, otherwise you wouldn't expose it like that.

If you're unsure, you can always pay extra to have Amazon verify such things for you, but they're not your IT infra manager and neither should they be.


AWS very much assumes you know what you're doing. Products like CloudConformity exist to let you know when you're probably misconfiguring something.


AWS has a product called Security Hub. There's a rule which will alert for any RDS instances open to the internet:

https://docs.aws.amazon.com/securityhub/latest/userguide/sec...
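The same condition that rule flags can also be checked one-off from the CLI:

```shell
# List RDS instances marked publicly accessible (JMESPath filter):
aws rds describe-db-instances \
  --query 'DBInstances[?PubliclyAccessible==`true`].DBInstanceIdentifier'
```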


SecurityHub will notify you if you launch the managed ES service outside a VPC. It’s still your responsibility to ensure that there is no ingress from the internet to the associated subnets.

If you’re launching EC2 and installing ES on it, you’re probably going to have to create your own Config rules for auditing.


There's been a lot of news about Mongo and ES; wait until people start searching for MS SQL servers with blank sa passwords. I've lost count of the number of conversations I've had trying to shut that down with various organisations.


I’ve run Responder on a Digital Ocean box (don’t ask) and seen large volumes of attempts to access a fake MSSQL server; the amount of zombie attacks is pretty eye-opening.


> Successful SSH login using the private key

I'll be the grumpy uncle and tell people NOT to do this.


Logging in using someone's password, even if it got leaked like this, is considered a crime where I'm from. I'm sure whatever jurisdiction the author lives in has similar laws.

If you know you're not supposed to be inside someone else's computer system, you shouldn't access that system, or shut up about it at least! A crime with good intentions is still a crime according to the law!


Are you saying not to log ssh logins, or not to allow access via keys? What would you tell people to do instead?


I'm telling people not to use a private key they found to log into servers. It's tempting to see how far the rabbit hole goes but it can be very problematic.


Article says it's part of "Kubernetes logs for the entire production clusters (gathered through log collectors)", not that the author actually found private keys and used them.


Wouldn’t using Shodan have been much easier? Or does Shodan not scan from within AWS?


Shodan keeps a running record of what it finds on the Internet, so it's not scanning from within AWS. There are a few other services that do the same thing. I kinda like the idea of using masscan and doing the leg work yourself to get a feel for it, though.
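A sketch of that DIY approach (the address range is a placeholder; only scan ranges you are authorized to):

```shell
# Sweep a range for common database ports at a modest rate, JSON output:
#   3306 MySQL, 5432 PostgreSQL, 6379 Redis, 9200 Elasticsearch, 27017 MongoDB
masscan 203.0.113.0/24 -p3306,5432,6379,9200,27017 \
  --rate 1000 -oJ results.json
```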


What’s the use case for exposing these services to anything outside a VPC? With a hardened bastion or a VPN endpoint, you can ensure only authorized users can access them. A firewall rule of “any AWS IP” seems horribly flawed.


People that know things are unlikely to accidentally do this. Unfortunately AWS accounts come with default VPCs that only have a public subnet, so unless you know what you are doing you are likely launching all resources with a public IP and likely allowing too much in your SG.


So we finally reached Web X.0, where data is shared and open -- we only got there accidentally.


Terrible! And terrifying!



