I am honestly surprised how little SPAM there is on GitHub in general. Please don’t take that as a challenge!
I’ve encountered a single instance. On a library I maintain, I recently had a PR and a series of comments added on all my outstanding issues by a person advertising their for-pay SaaS alternative. For every problem someone had: “Just try my SaaS with this connector instead…”
I hid the comments and closed the PR asking me to link it from my README, while chastising the person a little. Looking at the user’s profile/history, he had done similar things to many other related libraries. I reported the profile to GitHub and the whole profile was gone, with all his comments and tickets, in a matter of hours.
I honestly feel a little bad about that resolution. Here was a person who probably put a decent chunk of time into his SaaS and just made the mistake of advertising it in an obnoxiously intrusive manner. They had legit git repos, now gone. I could have seen a younger, dumber me in their shoes, especially as I literally built the same SaaS ten years ago, using my library, and didn’t get a single person to bite.
An HN user recently complained about GitHub closing their account. The GitHub activity was similar: a PR pointing to their product. Also, 90% of their HN comments were basically "interesting submission, also check out <my url>". So the user was either dense or didn't understand that it is a kind of SPAM. Maybe too much beginner advice for solo startup founders is 'hustle', as in: place your URL anywhere you can.
> Maybe too much beginner advice for solo startup founders is 'hustle', as in: place your URL anywhere you can.
I did 4 years at a startup where my job was essentially this (we were a small news startup and I led a small team of mostly college interns sharing links to our news videos with bloggers). _However_, there is a very crucial step I've rarely seen followed in the wild, and that is to _actually_ engage in the conversation. "Cool, here's a URL to my product" is spam. But actually investing time into a thoughtful comment (and first evaluating whether your product is actually a good fit for the problem) is not, IMO.
"But I don't have time" is not a valid excuse here. If you don't have time to engage with your prospective customers and really understand their needs and where they're at, what are you doing communicating about your product?
In my experience doing this, you'll still get a mix of people who receive it well and some who don't. Keep a list of both. Then you know who to follow up with and who to avoid so you don't develop an antagonistic relationship with them.
Also, looking at the numbers, this didn't look like a particularly cost-effective approach to communication at the time. We were not at all playing a numbers game but were instead trying to build relationships with bloggers who would feel comfortable including our content where they believed it would genuinely contribute to their blog and their audience. Over time, maybe the time we invested up-front paid off. I left for other things before I got to assess that.
On the other hand, we were basically developing relationships with unpaid "influencers" around 2008-2012, so maybe we were on to something (in any case, the startup sold to E.W. Scripps for ~$35 million in 2014 and is still part of their portfolio today).
When I was doing this we'd still link the URL in the content of our reply. But it would be part of a natural reply that engaged with the surrounding conversation, not just mindless copy-paste spam like "Nice, btw check out my video: <link>"
We get an unbelievable amount of spam in the Tailwind CSS repo, especially in Discussions. Usually it’s people posting some Markov chain looking sentence to make it seem somewhat real, then a couple of hours later it is edited to add a link to some website.
Yes, but what baffles me is that any spam, if there is any, is directed at GitHub accounts directly, be it through issues or contact addresses. The repositories themselves contain so many email addresses that are otherwise hidden from the platform. In a recent test of mine, out of 60 people that had their contact email hidden in their profile, I could still determine it by looking into commit authorships in at least some of their repositories. Why is no one targeting these? These are all active and often personal accounts, meant to not be discoverable.
IMO there's also no point in remaining silent about this, it's only a matter of time until the spam starts coming in over this channel. I'm pretty certain some day GitHub will cloak commit emails by default.
I have some email addresses only used to make commits. I have had a number of recruiters use that email address. So definitely somebody is looking at them.
You'd think by now some recruiters would've figured out that the best way to find people who know X is to search for commits in X and look at the attached email.
Sadly there are vendors now that'll resell that info to recruiters so they can e.g. look at your linkedin and the tool will match it to their github and grab the email from a database built partially with commit emails.
20 or so recruiters over the last 3 years have contacted my github commit address.
GitHub gives you an E-Mail address for commits through the web UI (see the E-Mail tab in settings), which you can just use for your local commits as well.
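If you want to check whether your own history still leaks a personal address, here's a rough sketch in Python. It is only an illustration, not an official GitHub tool. It assumes you feed it the text produced by `git log --format=%ae` in your repo, and that the GitHub-provided commit addresses end in `users.noreply.github.com`. The function name `exposed_emails` is made up for this example.

```python
# Assumption: GitHub-provided commit addresses use this domain.
NOREPLY_DOMAIN = "users.noreply.github.com"


def exposed_emails(git_log_output: str) -> list[str]:
    """Return the unique author emails that are NOT noreply addresses, sorted.

    `git_log_output` is expected to hold one email per line, e.g. the
    output of `git log --format=%ae`.
    """
    seen = set()
    for line in git_log_output.splitlines():
        email = line.strip().lower()
        if not email:
            continue  # skip blank lines
        domain = email.rpartition("@")[2]
        if domain != NOREPLY_DOMAIN:
            seen.add(email)
    return sorted(seen)


if __name__ == "__main__":
    sample = "\n".join([
        "12345+octocat@users.noreply.github.com",
        "me@example.com",
        "me@example.com",
    ])
    print(exposed_emails(sample))
```

Anything this prints is an address that anyone cloning the repo can read, whatever your profile settings say.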
I had the completely opposite experience when I was actively participating in the maintenance of a medium-sized project. Waves of randomly generated pull requests and issues containing randomly inserted words, from usernames that seemed to follow a similar pattern. Dozens per day, if not per hour. I reported them countless times but there was no reaction from GitHub besides the initial automatic response; even after half a year those accounts were still there.
The only thing that somewhat helped was limiting PRs and issues to contributors for a week or so (which is bad for users). Afterwards it was possible to turn them back on for everyone. But after a few months of silence the bots were back.
The whole thing was slightly bizarre. And their goal was unclear. The messages never contained any actual ads or links. It looked like someone was trying to simulate real accounts' behavior. Those accounts were doing similar things to other real repositories and also generating nonsensical activity in each other's forks of real projects. Maybe the person was still testing how to avoid bot detection. Or maybe the goal was accumulating accounts which could later be used for selling stars.
A lot of filters for finding spam in systems look at values like the relationship between reports (for bad behavior) vs "good" contributions. Spammers will try to game the system by pushing out lots and lots of "good" contributions to then be able to sneak in the few really bad ones.
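To make that concrete, here is a toy sketch in Python of a ratio-based filter and how padding defeats it. It is purely illustrative: the `Account` fields, the function names and the 0.2 threshold are all made up for the example, and it is not how GitHub or any real platform scores accounts.

```python
from dataclasses import dataclass


@dataclass
class Account:
    good_contributions: int  # e.g. comments or PRs that nobody reported
    reports: int             # abuse reports filed against the account


def report_ratio(account: Account) -> float:
    """Fraction of the account's activity that drew a report."""
    total = account.good_contributions + account.reports
    return account.reports / total if total else 0.0


def looks_like_spam(account: Account, threshold: float = 0.2) -> bool:
    # The threshold is an arbitrary illustrative value.
    return report_ratio(account) > threshold


if __name__ == "__main__":
    blunt_spammer = Account(good_contributions=2, reports=3)
    padded_spammer = Account(good_contributions=50, reports=3)
    print(looks_like_spam(blunt_spammer))   # same number of reports...
    print(looks_like_spam(padded_spammer))  # ...diluted by filler activity
```

Both accounts drew the same three reports. The second one only stays under the threshold because of the 50 filler contributions, which is exactly the gaming described above.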
I woke up to some huge spam threads from Github this morning. Someone figured out how to notify everyone in a major repo (at least, I think that's what happened), and others piled on with nonsense comments and even some very disgusting images.
Sure, it's nice how little there is, but it's definitely there.
Seems like people have recently realized this is an attack vector. Starting May 31, I've received an immense amount of spam. GitHub, in their infinitesimal wisdom, make you log in to unsubscribe! Which of course I can't do on my phone. So I've taken to reporting their notifications as spam.
..... never underestimate people's desperation/shamelessness for free t-shirts.
8-9 years ago, at DigitalOcean's first SXSW booth, we were giving out free t-shirts that had DO's Sammy shark logo, and Linux's Tux penguin logo. The catch was people had to name the penguin on the t-shirt. The vast majority could not name Tux, and pled to get the t-shirt anyways.
This is what I don't understand, though. If you only want to meet the requirements with low quality PRs, you've always been able to achieve that with your own public repository.
I've mostly used Hacktoberfest as an annual opportunity to learn some new tech that I've read about but haven't yet had a chance to try. I start a new public repo and work through their tutorial doing PRs to commit chunks for each major section etc.
Open source. No spam for anyone. I learn something. Free shirt.
Practicing PR hygiene on your own repo if it's something you're not already familiar with is also a benefit.
I think your question was a pretty low barrier to winning, but I guess not. That's about when SXSW became more of a circus, filled with lots of "fake it 'till you make it" types. I would overhear people practicing their pitches (usually in an elevator!) and they were all different combinations of the same phrases that were trendy that year: "We offer crowdsourced coupons via influencers to create brand awareness.." "We work with influencers in offering coupons, to crowdsource brand awareness.."
I think that's mostly true, but I've been surprised by non-tech people pointing out and recognizing Tux on various things of mine over the years since college (a T-shirt, my background on my laptop at one time, etc.). I think there's a certain segment of the population that will see an adorable penguin mascot, like it, and then look up what it is.
People will choose free stuff over cheap stuff all the time. The study below has an experiment (experiment 2). They first offered people either a Hershey's kiss for 1ct or a Lindt truffle for 15ct. The majority took the truffle. Then they decreased the prices to 0ct and 14ct respectively, so the same price difference. The Hershey's kisses suddenly sold better than the truffles.
I don't know if it is mentioned in the paper, but one important thing is the friction of paying, even if it is 1 cent. It works the other way around too: sometimes people pay to avoid more friction in the future (like in those "pay and you will get no ads, priority in the queue, etc." kinds of services).
It'd be interesting to see if the people who made those low quality "amazing project" PRs ever went on to make more PRs, with useful contributions. Maybe the suggestion Coding With Harry made did ruin Hacktoberfest, but also maybe he actually suggested a simple way for people to get into open source contribution and people did. Without following up we can't really tell.
A similar idea worked with my son (who is still in primary school). He found it very difficult to grasp the idea of Git in general and making small PRs like this helped him to warm up to the whole thing over a few days. Every beginner has to start somewhere I suppose.
If they're forcing their child to learn git I'd agree. If the child somehow got into contact with git (e.g. by getting into programming) and wanted to learn more about it, I don't see the harm.
The job market is a somewhat important part of life. Even if you prefer to hate work and coworkers on principle, it still helps pay for other things.
Specifics are a useful starting point for getting to fundamentals; trying to learn abstracted stuff without having anything to tie it to tends to be a bit tricky. Even for adults, which is why "here's a novel to help illustrate what we mean by these things" is a thing (for example, The Phoenix Project). It's also why math tends to start with arithmetic and memorizing times tables, and only after that gets to more fundamental stuff like algebra and then calculus.
It seems like premature vocational education. Vocational education is usually started as secondary or post-secondary education, not primary.
Starting vocational education so young feels like a slide back in time to the 19th century, except even then, apprenticeship usually began around 12-15 years old.
You are jumping to the conclusion that, just because the parent taught the kid something about "git", the kid didn't get to do anything else. Where did you get that idea from? Nowhere did the parent say that this is the only thing their kid does. They may very well engage in a hundred more childhood activities in addition to this.
It seems to imply that the rest of childhood is lost because of git. You may want to correct or clarify your comment.
Btw, childhood is a lot about throwaway exploration of many things, where we don't have to invent a purpose or utility for every single thing. It's all about spare time and trying things out with no strings attached. It doesn't matter if in some of their spare time they watch TV together, or go to movies together, or build a robot together, or code together. I think you're getting downvoted not because the utility is right or wrong, but because utility doesn't need to matter.
I downvoted this comment as I disagree strongly with that statement:
> git has absolutely no meaning for a child, it's a professional tool used for work
It's not just a professional tool for work. It's a version control system to version whatever you might think needs versioning. What's this attitude that software engineering tools and skills are just valuable for work?
> you don't teach children how to operate a chainsaw, do you?
A chainsaw is very dangerous to the kid. My dad certainly told me how to use a hammer, handsaw and many other tools. We built a lot of things together, like wooden remote controlled airplanes. It was certainly not to train me to become a woodworker or an aerospace engineer. And I will certainly build computer games with my kid some day, which will involve git (or its future equivalent) as one of the first lessons.
> like do you really think children are bothered with not being able to version-control files?
I've used lots of "...-1.zip" files to preserve my files when learning coding for fun, so yes, someone showing me git (if it existed at the time) would help. It took me a few extra years to discover CVS/SVN. (the system itself was exciting)
> yeah, why not also teach them law and performing a surgery?
There are kids who are actually interested in law and find the concept fun. You may also be shocked that I was really into accounting for my allowance and spending in a double entry system. Surgery would be hard for practical/legal reasons, but I know two really young kids, one obsessed with the idea of performing medical procedures, the other with being a paramedic. (And not in a silly "I wanna grow up to be a doctor" way, but rather "I've found a first aid course for kids I want to do".)
What I'm trying to say is, don't assume kids are pushed / forced to do things like that. Some of them really enjoy things adults find hard / boring / too work related.
I agree with everything viraptor said in the sibling comment and just wanna add:
> yes, work is still work, it doesn't matter if you're doing work as a fun hobby or as a job
No, it's not work. You do work to get paid, whether you like a particular task or not. You do your hobby for yourself because you enjoy doing it, and you just stop if you want to. I certainly won't force my kids to do that. I will try to get them into the subject with patience and let it go if they gave it a fair shot and don't like it. It matters a great deal whether you do something for work or for fun.
> like do you really think children are bothered with not being able to version-control files? is this an issue a lot of children are having?
Of course, everyone wants an "undo" button and lower consequences for experimentation. I really wanted one from the very first programs I wrote when I was 9 years old and didn't have one. I had no intention of making a career of it and had no one to share programs with. Completely irrelevant.
First of all, teaching someone git isn't necessarily job training. We don't have the broader context. Maybe the child initiated it because they were interested in what their parents are spending their time on.
Secondly, teaching a child how to use git doesn't mean the child isn't learning other things too.
Thirdly, I learned version control (SVN, before git existed) as a child too. Not as young as primary school, but I learned it on my own because I was interested in it. It wasn't done in the context of job training.
Fourthly, your comparison with a chain saw is just plain vacuous. Git won't kill or maim you. Holy hell buddy.
So at least in my case, your speculation about the reasons for downvotes is wrong. And also, stop whining about downvotes. :)
It makes sense if the kid’s already into programming, and is starting to run up against the sorts of problems a VCS is designed to solve like “oh no it was kinda working and my attempts to make it work better broke it and I closed the editor so I can’t undo my way back out of this”.
A kid in first grade is not very likely to be having these kinds of problems, but a site like this full of professional programmers is certainly where you’re most likely to find parents whose kids are having these problems because they’ve been genuinely interested in learning what Mom and/or Dad do all day, and have been taught some of it.
Until a couple years ago, I didn't use Git at work (it was SVN at work). However, I used it all the time for personal projects.
Come to think of it, SVN was the same way; I used it for personal projects while using CVS at work.
Also, as an aside, my dad taught me how to use a chainsaw, including mixing the fuel for the two-stroke engine when I was 12. As an electrical engineer, he certainly didn't use a chainsaw at work.
Why not? I got my first programming manual (for C64 Basic, of course) when I was 7, read Stroustrup at 14, and wrote x86 assembly at 16. I started working with CVS only at university, and wished that I'd known about it years before.
But thank you for gratuitously denigrating my childhood.
Everyone is part of some tiny, tiny minority. It's the best thing about us. It doesn't matter if it's coding, aikido, collecting sea shells, or keeping ferrets.
Denouncing one such tiny minority just because you don't see the value in it makes you sound very closed-minded and bigoted.
That's a huge assumption here. At the base of what a human needs are the physiological needs. Do humans need art? Fun? Maths? Git? Where do you draw the line?
That's not the only context. I would likely appreciate git at 12yo. I definitely enjoyed programming already and I believe that was the time I learned more about maths to code some music visualiser plugins and was sending fun examples back and forth with a friend. Git would've been useful.
Some of the best experiences of my childhood were learning how to code, which ultimately turned into a career that has opened up more opportunities and experiences than I would have known otherwise.
I think your premise that coding is bad for kids is incorrect. Coding is not typing. It's an art of solving problems.
My childhood was coding, it was(is?) something I was good at and I thoroughly enjoyed doing.
My initial days were just writing random macros on Excel or coding out my maths problems on BASIC.
That is what has shaped my career.
i myself started at 12 (actually with BASIC!) and while there’s no denying that it has helped my career at a later point in life, it has completely alienated me from the rest of people
my schoolmates still introduce me as “nerd” to other people (not in front of me of course)
So, if you discovered your (hypothetical or not) kid being interested in solving puzzles and programming, would you stop them? Instead of encouraging them to find some offline hobbies and spending time with other people to balance it out?
Also, people can be a**holes to others for any and no reason. Being different is just the most common and a seemingly rational one. It's an important life lesson to learn that it's often not possible to change people, and frequently not even worth trying. Adapting to them to fit their small worldview is even less promising.
I too learned BASIC when I was 12 and often felt alienated from my peers.
Turns out the alienation was from being a know-it-all, argumentative asshole and had very little to do with my programming skills. Most of my classmates probably would have found the programming somewhat interesting had I not lorded it over them as a badge of self-superiority.
it just so happened that my knowledge helped me perform better at IT class than anybody else, so they got pissed at me for being able to comprehend something they themselves don't
For what it's worth, it sounds like you had a childhood that was very different than what kids today are growing up with, at least in my area.
By around 2014 the cool kids in my local high school were the nerds, and it's pretty much stayed that way. I worked with a neighborhood youth group where the kid who was at the center of everything was super into Fortnite and coding. When he introduced himself to me the first thing he said was "I'm pretty much a nerd", and he said it with pride! A few of the other kids felt the need to establish their nerd cred, too. This was in a neighborhood in the poorest part of a nearly-rural town. Their parents were mostly in trades, not tech.
"Nerd" has become a badge of honor that is sought after and claimed by kids, not a derogatory label assigned by others.
You are absolutely correct. We have a 3 dimensional learning environment outside that's objectively better for learning and playing than anything else.
That's what happened for me in a way. My first PRs were stuff like fixing dead links, typos, stuff like that. Now I regularly do PRs to fix stuff that's broken, either code or documentation.
That's true, hence my "in a way". My contributions and the "an amazing project" contributions are not the same. There are lots of open source projects that have some infrastructure in place to help new people get their first PR, which will be more meaningful than a spam PR. Here's one example: https://www.firsttimersonly.com/. I'll encourage anyone reading this that never contributed to a project to give it a try. People are very welcoming, helpful, and at least for me it makes me very happy every time I get a PR merged.
I'd generalize a bit and say that a lot of the Hacktoberfest spam PRs were limited to adding non-informational content to README files -- adding adjectives like "great" or "awesome", or phrases like "we hope you enjoy it" or "have a nice day", etc. Nothing that actually demonstrated any familiarity with the project or an intent to improve it, let alone anything that actually touched the code of the project.
Why not call this "How one company brought spam to GitHub"?
I had a suspicion something like this would happen the first time I heard of Hacktoberfest. If you ask people to make PRs to win something, with very little extra condition, they will. This is pretty much the situation of "you get what you measure". I can't find the reference now, but I recall a similar thing happening when Wikipedia asked for contributions in return for a chance to win something(?) --- tons of almost-entirely-plagiarised articles appeared from Indian accounts.
- Goodhart's Law, often stated as "When a measure becomes a target, it ceases to be a good measure".
- the Cobra effect. When the British government placed bounties on wild cobras, people started breeding cobras.
- Campbell's law: "The more any quantitative social indicator is used for social decision-making, the more subject it will be to corruption pressures and the more apt it will be to distort and corrupt the social processes it is intended to monitor."
Was this around the time people messed with that MIT pgp keyserver? (Blanking on a cite)
I've noticed these "Youtubers" doing a lot of things because they are not part of the hacker community, but adjacent to it, while being propelled to prominence and given $$$$$ to fund their lifestyle, it really grinds my gears.
This is overblown to blame the Youtuber - his statement was sufficient, and I say this as someone who also hates shitty PRs from the very low quality "tech talent" from certain geographies.
The bigger issue here is that there are literally millions of people in the developing world (often assumed to be entirely from South Asia, but actually spread across most developing countries) for whom tech is suddenly accessible as a potential road out of poverty wages and up the earning ladder, but there is no quality control whatsoever for them or their learning.
I don't know what the solution is, but if the internet is truly for the masses and is connecting the world, then guess what, this is part of our world as well.
I did the same. I finished and pushed out a small static site that I still use occasionally. Haven’t participated since then, though I continue to push code and notes publicly.
I think the bigger issue is that the social graph is not easily accessible. It should be possible to automatically reject pull requests from new accounts that only change one line of text.
That is true, but the difference between these "drive-by PRs" that change one line and useful PRs that do the same is that the latter probably have a lot more description and other accompanying detail.
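Putting the two comments above together, a first-pass filter might look something like this Python sketch. Everything in it is hypothetical: the `PullRequest` fields, the function name and the cut-offs (30 days, 1 line, 15 words) are invented for illustration and are not part of any GitHub API. A real bot would have to pull these values from the platform and would still need a human to review whatever it flags.

```python
from dataclasses import dataclass


@dataclass
class PullRequest:
    account_age_days: int  # age of the author's account
    lines_changed: int     # added + removed lines
    description: str       # PR body text


def is_probable_drive_by(
    pr: PullRequest,
    max_age_days: int = 30,          # illustrative cut-off for a "new" account
    max_lines: int = 1,              # "only changes one line"
    min_description_words: int = 15  # useful PRs tend to explain themselves
) -> bool:
    """Flag PRs that are tiny, from a new account, AND barely described."""
    new_account = pr.account_age_days <= max_age_days
    tiny_change = pr.lines_changed <= max_lines
    thin_description = len(pr.description.split()) < min_description_words
    return new_account and tiny_change and thin_description


if __name__ == "__main__":
    spam = PullRequest(3, 1, "an amazing project")
    fix = PullRequest(
        3, 1,
        "Fixes the off-by-one in the pagination helper: the last page was "
        "dropped when the item count was an exact multiple of the page size.",
    )
    print(is_probable_drive_by(spam), is_probable_drive_by(fix))
```

Requiring all three signals at once is the point: a one-line fix with a real explanation gets through, and so does a terse PR from a long-standing account.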
I'm not familiar with the rules: can I open a repo with the Hacktoberfest hashtag and a bot that accepts all PRs? Would that help to game the system and redirect most of that spam to one place? I don't want to encourage this behavior, but let's be honest, it's hard to stop now, so maybe redirecting all the harm to one place would help OSS projects?
I genuinely like the hacktoberfest shirts, so every October I just make a new GH Repo with some text files that I make 5 PRs for, this has worked for the last 6 years lol
Really not sure why people bother spamming other repos when you can just make your own
I'd be curious about how many developers actually start their Open Source journey with such a drive-by contribution, compared with how many do it only for the t-shirt. While every new productive contributor is an incredible win for an Open Source project, sifting and rejecting drive-by pull requests puts severe load on projects that mostly rely on voluntary contributions. All of this just to allocate some t-shirts. Not surprised if commercial projects will completely opt out of hacktoberfest if this continues.
The simplest solution would be to eliminate the rule that started to encourage this behavior. It's sad, but this is why we can't have nice things.
This is already done on an annual basis. Git repos spring up every October purely to accept PRs, no matter how trivial. They regularly make it to the top of the trending repos list.
I think DO excludes them from the final count though. It doesn't stop people from trying.
I also participated in Hacktoberfest ... I think it was 2018. I fixed some bugs in my friend's code that he wrote using OpenCV and a second PR was where I added some code to implement an AVL tree that I wrote for my lab. The other three commits were just low quality stuff from what I can remember. Got a nice T shirt though.
> But he made a mistake here and he's going to have to be responsible for the outcome, which so far isn't looking great.
I don't see how it's the guy's fault that people misuse this information. It's useful information, and the people making low-effort PRs are responsible for their own actions.
You have to add a pain threshold that people will only cross if they think it's worth it for the reward. If the terms included "the PR must be in good faith or constructive to the project and not merely fixing typos or other immaterial changes", people would assume they need to put in real changes, which would be more than it's worth to them for a t-shirt.
I think an even better goal for Hacktoberfest would've been documentation. Translate a page into a new language, or add documentation where it doesn't exist, and you get a t-shirt. Entirely automate the docs for code, get a plushie. Write HOWTOs, get $25 free DO credit.
In many cases, what I find lacking is not "unannotated class list" Doxygen documentation, or contextual examples, but proper documentation of pointer lifetimes and invariants (for example, the `snd_pcm_link` docs merely say the two streams will start and stop together, but don't say it fails unless you're linking two hardware devices with a shared underlying clock, which are supposed to remain in sync indefinitely), and, for new project contributors, architectural/implementation and control flow documentation. Unfortunately, invariants often live in the original programmer's head more than in the code, pointer lifetimes are often nonlocal in the codebase, and architecture/implementation and control flow are inevitably nonlocal, requiring weeks or months of effort for a motivated non-author to learn, far beyond the patience of someone fishing for Hacktoberfest T-shirts.
> You have to add a pain threshold that people will only cross if they think it's worth it for the reward. If the terms included "the PR must be in good faith or constructive to the project and not merely fixing typos or other immaterial changes", people would assume they need to put in real changes, which would be more than it's worth to them for a t-shirt.
You'd think, but never underestimate the pain that people will go through to get some minor profit for free. It becomes a numbers game. If you think your "immaterial" PR has a 1/10,000 chance of slipping by, you just make 10,000 immaterial PRs. There are a nontrivial number of people who will absolutely sit there and manually copy/paste 10K pull requests if they think it will get them $1-$2 worth of stuff. The smarter ones would find a way to automate it.
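For what it's worth, the arithmetic behind that numbers game is easy to check. If each attempt slips by independently with probability p, the chance that at least one of n attempts gets through is 1 - (1 - p)^n. A quick Python sketch follows; the independence assumption is mine, and the 1/10,000 figure is just the one from the comment above.

```python
def chance_at_least_one_slips_by(p_single: float, attempts: int) -> float:
    """P(at least one success) for independent attempts: 1 - (1 - p)^n."""
    return 1.0 - (1.0 - p_single) ** attempts


if __name__ == "__main__":
    p = 1 / 10_000
    for n in (100, 1_000, 10_000, 30_000):
        print(n, round(chance_at_least_one_slips_by(p, n), 3))
```

With p = 1/10,000 and 10,000 attempts that comes out to roughly 63%, so it isn't a sure thing, but the expected number of PRs that get through is still n × p = 1. That's why the spam scales with the number of attempts rather than with their quality.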
Seeing this guy's low quality PR and his viewers' low quality PR that followed reminds me of GPT3.
Sometimes we are all like GPT3. I think the difference is some weird meta-abstraction level mechanism of decision-making related to consciousness (in ways we do not understand) ...and sometimes people are just not thoroughly conscious, so we see very little trace of this mechanism in their behaviour.
There was a group in Lansing pre-Covid that held a Hacktoberfest event. I didn't attend but talked with someone who did. They told me it was emphasized if you're not adding value don't do it. Hope this doesn't end up with Digital Ocean ending their support.
I never accused the article of being racist; even I was disgusted when this popular YouTuber encouraged spamming PRs on GitHub just for a free shirt. But re-posting an article (which already reached the front page of HN) to HN right when there is a similar post about an Indian guy messing up big time by sending 400k notifications seems kind of suspicious.
It isn't exactly suspicious. I have noticed old news being posted again when it's related/similar to a currently trending news item. Let's just assume that it's an unfortunate coincidence that the people responsible for both fiascos were Indians.
However, these incidents do raise some important issues for both Github and India. This indicates that there is a prevailing situation in India that encourages young novice developers to come up with such low quality contributions to get a foot in the door in the industry.
I don't think it's that suspicious. Neither article highlighted the person's ethnicity in any sort of detail (I didn't even notice the ethnicity at all in the 400k emails post).
There are often multiple front page stories about mistakes or intentional problems caused by white dudes, and that's not generally seen as racism. Why should a similar coincidence with an equally-large population group be racism?