Crowdsource by Google (crowdsource.google.com)
140 points by YPPH on May 15, 2022 | 131 comments


One of the tasks in this app is audio validation:

> Audio validation: Listen to a short audio clip and determine if the pronunciation sounds natural in your language.

If this is something you're interested in doing, I recommend contributing to Mozilla's Common Voice instead. Common Voice builds freely licensed (CC-0) voice datasets that can be used by anyone, not just Google:

https://commonvoice.mozilla.org


[flagged]


Probably because while some data has been open-sourced, there isn't any statement on what other data will be open-sourced, or a commitment to it by Google. Voice data hasn't been open-sourced.

Mozilla, on the other hand, is a non-profit which will release the result of your labor under a very permissive license.


Because most people have finite time and energy.


Finite time: dedicate it to the freely licensed org, not the company with 30++Billion in profits. They can pay people to do this


>....30++Billion in profits

earn badges and level up


Because calling it "free labour" just wouldn't attract quite as many people...

Perhaps ~2 decades ago I would've been a bit more optimistic, but now that I'm aware of the authoritarian dystopia which Google and the rest of Big Tech are trying to drive society towards, I feel obliged to point out that helping it achieve that goal is a really bad idea.


Free labor is all that is being pedaled in most of these projects... It really worries me about our future as a planet.

It's crazy how many people jump on doing surveys for free, and even on creating content for social platforms daily for free, because it's promoted by billion dollar profit companies that could easily hire and pay people while still making good profit.

The free labor economy is insanely out of control now, and accurate truths about the exploitation and data mining never will be discussed because these companies can easily suppress and downvote them. I don't do any surveys, haven't since before the Cambridge Analytica story.


*peddled


I gave away quite a bit of free labor to Google and Niantic as they were launching Ingress. If I had known that Pokémon Go would eventually be a thing, I might have thought a bit more about the power I had to shape where people would be walking every day for the next 10 years.


I don't know what you are ranting about but the term "crowdsourcing" is exactly what this tool is created to do. Don't use it if you don't want to.


It's still funny a company awash in money wouldn't just hire people to do this.

And I'm quite sure people criticizing this won't be using it :)


Because it's cheaper and it's clearly working. I doubt someone doing this as a full time job would be as good at these monotonous tasks as 10 times as many people who are doing it for a few minutes per day.


It could still be called "crowdsourcing" if it were compensated.


I was always a huge fan of Google from the late 90s onwards and was employed there from 2011 to 2013. I continue to use their services to this day. Unfortunately, no matter how much you like something, when any one entity amasses this much power it's bound to make you a bit uncomfortable. I wish Google did more to anonymize and potentially distribute this kind of tech in a way that enabled anyone to run it. At Google's scale it's really hard to trust them with more and more personal data, considering the employees could be quite literally anyone with any personal agenda that does not serve the greater good. They have no way of vetting employees to that degree and that's the biggest issue. I'm not just handing my data over to a machine or some homogeneous blob. I'm handing it to people I don't even know.

Ultimately crowdsource leads to UBI powered by Google. It's inevitable. But one entity controlling our lives is very hard to accept even if it is Google.


I agree with many of your points, but re: employees frankly internal systems at Google are pretty heavily audited and locked down not just in regards to PII but in regards to any queries against e.g. log data that try to narrow data down to small result sets. I can't speak to these ML data sets, but I was at Google 2011-2021 and in my time I believe I applied for and got log data access twice and the restrictions were serious, and the whole thing is heavily audited.


That might be true and I believe you, but that doesn't change the fact that I'm more and more deeply uncomfortable handing over more and more personal data to a single large entity that basically knows as much about my life as I do. The single rogue employee might not be able to do much, but inevitably all these companies suffer leaks, hacks and whatever else. Google is not immune to this. Machine learning is also a black box; as much as anyone will tell you they understand it, they really don't. So the potential for mishaps there is high. Again, I do believe Google is the best technology company on earth, and I continue to use their products, but I'm also more and more in deep contemplation on how to build viable alternatives to a lot of the products I use. I'll never be able to build things at their scale or even at the caliber of their UX, but maybe there are better alternatives as we go into ambient computing.


People are always worried about Google or FB doing something nefarious, but in all honesty it's going to be your local dentist or something that leaks all your private info and gets your identity stolen and life ruined.


Yes, precisely. With Google etc. the concern is less "rogue employee" and more... rogue corporate motives/incentives.


> Ultimately crowdsource leads to UBI powered by Google. It's inevitable.

UBI also provides income when the recipient performs no work at all for whatever reason. Why would a company support that?


The idea is that you will continue to use Google's services on the condition of receiving UBI. Crowdsource would continue to exist until UBI is in effect and then be used as a gamified way to earn even more money. Remember UBI is not infinite money. You get a baseline for paying your bills. Above and beyond you'll need to earn whatever you need for pleasure.


I know this is a hot take, but Google isn't literally evil. If there's a future when the entire world becomes a work-free utopia, they're not going to stop it just to make sure they can send children to coal mines while twirling their moustaches.

The fact that Google supports major open-source projects today is a good indicator that they're perfectly capable of participating in widely beneficial projects that only benefit them indirectly.


UBI is the "forever ticket" that ensures whoever is in power when UBI is instituted will be the people in power forevermore. UBI is the means by which the 1% seal their and our fates forever, because there is zero incentive to change or improve a system where all power and decisions are performed by ignored automation for a powerless, uneducated and education-incapable 99%, while the 1% in power wallow in excess.


Let's relieve humanity from the hard-scrabble "earn a living" conditions because A) we can, and B) not doing so is cruel.

Maybe we slide into complacency and sloth, but I doubt it.

Personally, I think it would free up more time for things like parenting and education, and working out political differences with words, not weapons.


How would UBI make the power gap larger than it already is?


UBI creates a welfare state with a culture that does not need to work to live, so many would-be STEM professionals simply become tailgate partiers for life. A culture is created as a result of UBI that is detrimental to education and scientific advances.


What if it wasn’t enough to live on, but enough to significantly impact the ability for the poorest to survive? Say $3600/yr = $300/mo.

Don’t you think that could increase the pool of STEM candidates from poor families without creating a large new subculture like you’re describing?

That being said, I don’t think even a greater UBI would decrease STEM candidacy. UBI would have to be massive before people stop wanting to be in STEM. If anything, the problem would be that we lose the bottom rung of unskilled labor.


I'd love to see an outcome as you describe. I'm doubtful.


How will the 1% in power get others to work for them?


There's little or no reason for most folks to work, or soon will be.

It's no longer a market dominated by labor. Factory jobs are a nearly-extinct fraction of what they once were. Office jobs have dried up and pushed 30M people into service (flipping burgers) at a fraction of what they used to earn.

When service jobs become automated (happening at a breakneck pace?) then what? "The workers own the means of production" becomes an empty phrase when there are no workers in the plant.

Lots of hyperbole about changes caused by UBI, particularly in apocalyptic scenarios. But the alternative may be, let them starve? Because that's a conspiracy theorist's preference?


How do they now? Starvation, place to sleep, criminalizing homelessness. With UBI, the excuses cease and criminalization doubles down. UBI is a trick, folks, do not fall for it!


> How will the 1% in power get others to work for them?

Tens of millions of Americans can effortlessly live in luxury relative to our ancestors. Most of us on this forum could retire to a poorer country and live comfortably off savings alone. We don’t.


Yes, my question was intended to lead to the answer of “the 1% having to pay them”. As in UBI will give lower wage workers a better negotiating position from which to extract more wealth/power from the 1%.


No, that is not how it works. UBI removes any negotiating position; what are the UBI recipients going to do, strike their non-existing jobs? Riot, and trigger the 1% police-thugs? UBI is a terrible idea, and how a human mass execution is set up.


I was expecting to be pessimistic, but Google actually releases the datasets under a permissive license (CC BY 4.0). Awesome!

https://research.google/tools/datasets/open-images-extended-...

https://github.com/google-research-datasets/hiertext


It's nice that Google is releasing something, but the 3 datasets (https://crowdsource.google.com/about/open-source/) cover only a fraction of the Crowdsource tasks:

  Food compare: Compare the characteristics of two food images.
  Response rating: Evaluate the natural-ness of a bot response.
  Audio donation: Record your voice to improve speech technology.
  Food facts: Tell us if a food dish has particular characteristics.
  Food labeller: Tell us what food an image contains.
  Semantic similarity: Judge whether two phrases have the same meaning.
  Chart understanding: Judge whether charts are understandable and trustworthy.
  Glide type: Glide your fingers on the keyboard to type the text that you see.
  Audio validation: Listen to a short audio clip and determine if the pronunciation sounds natural in your language.
  Image label verification: Tell us if images are tagged correctly.
  Image capture: Collect and share photos of your part of the world.
  Translation: Translate phrases and words into different languages.
  Translation validation: Select which phrases are translated correctly.
  Handwriting recognition: Look at handwriting and type the text that you see.
  Sentiment evaluation: Decide if a sentence in your language is positive, negative or neutral.
  Smart camera (Android Lollipop 5.0+ required): Point at an object and see if the camera can guess what it is.
https://play.google.com/store/apps/details?id=com.google.and...

Even in those 3 datasets, Google does not disclose the proportion/percentage of the crowdsourced contributions that are released publicly. I would not contribute to Crowdsource with the expectation that my contributions would help build a freely licensed dataset.


"Google shares some Crowdsource data via open-sourcing"

Looks like the releases will only be partial. This kind of data should be collected by a nonprofit for the benefit of everyone.


Also, what the crowd gets is “cool badges”

Who even comes up with these things?


Marketing people who understand how the average citizen works? I'm pretty sure they'll find thousands to low millions of volunteers.


I was expecting to be pessimistic, and I am, because I thought that datasets that actually matter probably won't be released and there is no guarantee that this trend will continue. Please don't trust the giant.


Yeah, most of the 'Google Research Datasets' github account has super boring datasets. There's no way they'll help out with the actually interesting datasets (this is just PR).


That blog post doesn't state that _all_ collected data through Crowdsource will get published.


Do you really want raw, unmoderated user generated content?


Yes! They can provide both filtered & unfiltered data.


I imagine there would be some serious legal concerns with releasing the raw data (I work for Google, but no special insights into this project)


If you're not gonna release the data, then don't do the project in the first place (especially not under the implication that it'll be shared with everyone). Saying in hindsight "oh we can't release all the data due to sensitivity issues" is just weaseling your way to keeping the most valuable data to yourselves as if the stated issues couldn't have been predicted.


Why? Users volunteer the data. Just ask them if they're ok with it being public.


I think that approach is akin to the honor system and based on my experiences on the internet I fear that it won't scale well. For some types of images, just because the uploader is ok with the file being shared doesn't mean it's a good idea to redistribute it. For a bland example, think of a photo where the uploader doesn't have copyright. I'm sure you can imagine what would happen if someone on the seedier parts of the internet says "hey, if you upload your images to this website, Google will host it for free forever!"


One negative, I guess, would be uncovering the moderation algorithm so a malicious user could circumvent it.

Another negative would be release of Bad Words or illegal content submitted by malicious users. Depends on the task.

But the actual raw data would be of more use to researchers than one cleaned from an output of algorithms. Perhaps there could be a program for educational researchers?


Would this be compatible with importing into OpenStreetMap and similar open data projects?


OpenStreetMap does not host images, other projects (close to OSM) do.


In addition, CC BY 4.0 isn't compatible with OSM without a waiver for some of the terms. :(


Sure, that too.


Btw what do those 'data cards' actually do? Can you get sued for going against it? Does it conflict with the permissive license or does that take precedence?


It's a strange ML term to describe the data's metadata.


I guess Mechanical Turk was too expensive for Google. I’m all for collaborative, open source projects but it seems kind of skeevy for a company as profitable as Google to basically set up free labor for themselves.

I hope there are some fun 4chan stories of people breaking classification.


How naive can people be? Help Google build AI? Google is very profitable business. They should pay whoever works for them.

I can do some free work for wikipedia, osm or the like. But not for Google.


Is this essentially Mechanical Turk, but without paying the laborers?


Not really, you can't get child labor on MTurk (officially). But hey, you get badges, and you help the community by providing them training data they'll lock up in a server and use to monopolize more sectors of the economy. Everyone wins when google can do better sentiment analysis on your non-english emails.


[flagged]


Neither did you. They release 3 datasets partially out of dozens

https://news.ycombinator.com/item?id=31385975


Your aggressive tone is completely inappropriate, especially considering that you are mostly wrong. They have released a single dataset, and specifically state on the site that only some of the data will be open sourced. So, perhaps it is you who should be reading further.

"Google shares some Crowdsource data via open-sourcing, for the benefit of the global research and developer communities. This includes the 400,000 images that we've already released as Open Images Extended, with more sharing to come."

https://crowdsource.google.com/about/how-it-works/


First, they've released three datasets, not one. It's wild to expect they'd release all of the data, both legally and ethically. Publishing data that turns out to be garbage is worse than publishing no data at all. Publishing unfiltered user generated content when they also have the resources to evaluate quality is silly.

They've built and maintained the site that nobody else is seemingly building and offering. They've already released data from the site that's high quality. They have committed to releasing more. What more do you want from them? All things considered, this is better than essentially every other big company is doing when it comes to giving away free training data.


The GP made sarcastic commentary about Google keeping and using this community-generated data to empower themselves. Your response was to belittle them and arrogantly claim that Google is releasing the data. You were then shown that Google is only releasing a small (as implied by "some") subset of data publicly, which makes the GP's claim mostly true.

But now for inexplicable reasons, instead of just apologizing to the GP for being rude, you're scrambling to apologize for a corporation whose entire existence is notoriously based around collecting, abusing and selling user data, while also brazenly colluding with all of the other big companies to accomplish that abuse.

And for the record, the "three" datasets you're attempting to nitpick -- again without doing any of your own research, but instead quoting some other comment -- are actually just three partial subsets of the same Open Images[1] dataset, which itself is one of many datasets. So, they've still only released (and intend to release) a small portion of the overall data.

Your only intent was to make someone on the internet feel stupid, when in reality they were mostly correct, and you are still mostly wrong. Your previous comment got flagged to death. Take the hint. Doubling down isn't productive.

[1] https://storage.googleapis.com/openimages/web/index.html


It's even wilder to say both

> they are releasing the data under a permissive license.

and

> It's wild to expect they'd release all of the data

and still have an attitude.


Seems like, with a better name tho.

But you get fun out of it! And you make a difference (in Alphabet's bottom line).


..or "Crowdsource for Google"


Scammy how they're spinning it with language used by non-profits/charity: "Learn why your help matters".


From the authors of Google Reader, it-was-free-for-a-decade-but-now-pay-or-you'll-lose-all-your-domain.com-emails-handled-by-previously-free-workspace-product, the secret agreement with Facebook for limiting competition in programmatic advertising https://www.nytimes.com/2021/01/17/technology/google-faceboo... now arrives "Work for us for free in our proprietary product in exchange for a badge". What a big load of shit.


Could have, at least, been an NFT


To add insult to injury, they coined the "Don't be evil" mantra. Good for hiring, for onboarding users... And then the daughter of Eric Schmidt, of all the companies in the world, chose Cambridge Analytica for her internship. AKA: the best social profiling algorithms in the world based on personal data (eroding your privacy). https://www.businessinsider.in/Emails-link-Peter-Thiels-Pala...

The media has been too afraid for some time now to talk openly about what's happening with Google.


Could have been yes.

Imagine if it was a beanie baby instead man. I think those are going to the moon.


Crowdsource by Google aka "Help a billion dollar company to create datasets for free"


Make that a trillion


Rare sighting of the long scale in the (English speaking) wild!


We are already providing search clicks and video likes as indicators of content quality. They train AIs on our preferences.


It's not working so well though, in my experience anyway


Does anyone else hate what corporations have done to the word "delightful"? All the corporate art is so homogeneous too.

Also a trillion dollar plus market cap company trying to get free labor? Seriously?


> Does anyone else hate what corporations have done to the word "delightful"?

Yes, and also to the word "exciting", the most overused word in the history of all human language.


It's giving free services. What about that? And it's not just this one company; many other trillion $$ market cap companies give free and paid services and also welcome free feedback.

When you comment on Yelp, a Google Maps review, or YouTube, it's the same.


"free services"

(...subsidized by the huge amount of ads inserted everywhere)


Yes, it was those evil “boomer” companies that charged us money and gave us poor UX. Google saved us from that, praise Google.

I call this “digital tunnel vision”.


We all need to contribute to save Google from the dangers of nonrecord profits


"People like you helping people like us help ourselves."


Basically, "How to make people work for free 101." What is the value of those badges?


They could at least give out some NFTs :)


The app was initially released in 2016, with the last update half a year ago. Did anything happen to get this posted?

If not, maybe add a (2016).


Now let's imagine it's fun enough so that people don't see it as free labor, but see it as genuine entertainment, like, say, Wordle.

How would they weed out bad-faith players who would try to teach the AI bad ideas, just for, well, fun of it?

I don't believe they might fail to anticipate that; I want to understand their approach.


If you ask enough people the same question, it's easy to identify and drop outlier answers.
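As a rough illustration of that idea (not Google's actual pipeline, which isn't described anywhere in this thread), here is a minimal sketch of majority-vote aggregation. The function name `aggregate`, the `min_agreement` threshold and the sample votes are all made up for the example. It keeps an answer only when enough contributors agree, and reports how often each contributor disagrees with the consensus so persistent outliers can be dropped.

```python
from collections import Counter, defaultdict


def aggregate(votes, min_agreement=0.6):
    """Majority-vote aggregation over (contributor, item, answer) tuples.

    Hypothetical helper for illustration only. Returns two dicts:
      consensus: item -> most common answer, kept only if its share of the
                 votes for that item is at least min_agreement
      rate:      contributor -> fraction of their answers (on items that
                 reached consensus) that differ from the consensus
    """
    by_item = defaultdict(list)
    for contributor, item, answer in votes:
        by_item[item].append(answer)

    consensus = {}
    for item, answers in by_item.items():
        top_answer, count = Counter(answers).most_common(1)[0]
        if count / len(answers) >= min_agreement:
            consensus[item] = top_answer

    total, wrong = Counter(), Counter()
    for contributor, item, answer in votes:
        if item in consensus:
            total[contributor] += 1
            if answer != consensus[item]:
                wrong[contributor] += 1

    rate = {c: wrong[c] / total[c] for c in total}
    return consensus, rate


# Made-up sample data: two good-faith contributors and one troll.
votes = [
    ("ann", "img1", "cat"), ("bob", "img1", "cat"), ("troll", "img1", "dog"),
    ("ann", "img2", "pizza"), ("bob", "img2", "pizza"), ("troll", "img2", "shoe"),
    ("ann", "img3", "positive"), ("bob", "img3", "negative"),
]
consensus, rate = aggregate(votes)
print(consensus)  # items with clear agreement
print(rate)       # per-contributor disagreement rate
```

Items without clear agreement (like the split vote on `img3`) simply get no consensus in this sketch, so contested questions are left undecided rather than being used to penalize anyone.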


That depends how political the question is.


As far as I see this, Google want us to do the job...for free. No, thank you very much. Payment for this should be mandatory. Do not work for free for a humongous company.


Not for free! You can get useless badges!


I suggest pairing this with another front-page HN article from today: https://thehustle.co/why-free-stuff-makes-us-irrational/ and in that article, s/us/Google/g.


So this helps improve Google’s proprietary products. But they aren’t willing to open source the data set?


According to this page, they have already open sourced some datasets: https://crowdsource.google.com/about/open-source/


But they say nothing about it in the linked announcement.

> Help Google create AI that understands your language and culture. Make your favorite apps and services even more useful and delightful for your community.


I’m sure they’d generate more, and better, data if they committed to open source.


They should avoid using the word community if the side product of these contributions is going to empower their sales. I have been contributing to Google Maps for more than 8 years now, but in return all I get is stars and, at most, an invite to a meetup. But when I saw the other side of Google Maps, from the point of view of consuming it through the API, the prices are not affordable in the long term. Also, there is a clear monopolization of maps data, but the data is enriched by human volunteers who never get a share of the API revenue. In this age of web3, all these crowdsourcing and community keywords from companies like Google will earn more backlash than a warm welcome.


They are not forcing anybody to contribute; it's free and voluntary, just like a community garden. Even walking in a garden is not free, because somebody's tax money is in use, even if you are not paying taxes.

They do provide many free services. Google Maps being one of them, you liked their product and used it and it was very useful hence you contributed back to improve it.

If you had paid for their services then you could complain but if you are using it for free then don't use it for free and then complain. You could always use other products or paid products like GPS from garmin.

I don't understand why so much hate towards companies which are supporting free products and enriching lives of billions of people.

If these free products were not there can you imagine how so many poor people across globe could afford free internet technologies?

// Not google employee, opinions are mine.


You should put your efforts into open street map instead


Yes, I'm looking for an alternative app to StreetComplete, which is not available in the App Store.


I'm curious why you'd put so much free effort into a paid-for product by a trillion dollar company.


To be honest, until i saw the API side of Google Maps i was under the assumption that Google is doing great service with Google Maps. Even now Google Maps gives the accurate result thanks to all the contributors.


But the Google Maps API isn't free so I don't think your point applies here.


I think you are saying the same thing? Without the API it sounds like a great community service that is kept accurate by its users. But the paid-for (and pricey?) API that profits from this is what gives a bit of a sour taste to the 'community' feeling.


you're getting the use of the service..... for free...Yeesh!

Consider before Google Maps. You'd either (a) buy a local map every few years (b) pay for a car navigation system. Now Google comes along and offers you a free maps and navigation service and you're complaining that if you try to make it better you're being screwed over? IT'S FREE!


Yes, it's free and good. I'm not complaining about Google Maps as a product; I'm complaining about the system built around Google Maps.

1. I don't know how many contributors to Google Maps know the API side of the story. If they knew, would they be ready to contribute for free?

2. Though Google might have touched on selling user data in the Terms & Conditions, the Terms & Conditions themselves are a big trap, many pages long and worded in complex sentences. Yes, I may be lazy/dumb not to go through pages of T&C, and I accept that part as my mistake.

3. When Google Maps contributors contribute, apart from giving them points and appreciation, Google never tells them (in a visible manner) that their contributed data is going to be sold in the form of an API. Whereas on YouTube, contributors do know they may get compensated after N number of views.

Overall, the point is similar to everyone's complaint about Google, "User data is being used in their favour", but my expectation is for Google to say so openly and visibly so all the competitors have a level playing field.


Google is making a lot of money on Maps. The consumer app is free only so you feed them free data from your rides. They also drove all competition out of business (hard to compete with free). And as usual with monopolies, when there’s no competition, you can do what you want. Wait for it. In a few years it will be shut down or will cost a lot if they can no longer sell your data to someone else.


Google Maps has lots of competition. Bing, OSM, Apple, CityMapper, HERE, Yahoo Japan, TomTom, Garmin, Waze even though they own it… I could keep going.


This is kinda funny.

> "Connect with others around the world"

The only connecting I'd want to do is with Chinese people to practise my Chinese. In which case my horribly broken Chinese would be a detriment to the algorithm.


> Find Your Passion: Spotting food or animals in photos…

> Get Recognized: earn cool badges, level up on the leaderboards…


It's strange that they don't pay contributors for this at all. Google frequently pay me in Play store credit via Google Rewards to answer questions about places I've been and videos I've watched.


If Google asked me to help with solving world hunger, I wouldn’t do it. Helped them enough with my data, and my lost faith in humanity when they were complicit in ruining the web.


How much do they pay?


I may be coming across as too cynical on this. So folks are supposed to add value for Google for free, and Google can get away with it through gamification tricks?


> As a member of our global community of contributors, you're helping to create AI that can best serve the rich

Ok


"work for us and don't get paid"

"But you can look at this image of a medal"


Hand over your most sensitive information and get “cool badges” in return. Sign me up!


You truly are the product.

Don’t feed the FAANG.


None of the FAANG companies are a cartel or forcing any users to use their products at gunpoint.

Don't like their products? Then don't use them. Use an alternative.

They built one of the best products hence most of the world is using it.


This has gotta be trolling, but in case not.

Your comments are demonstrably disingenuous. You can't really opt-out. Employers, friends, family, partners, colleagues, peers, etc all rely on you relying on the FAANG/MAAMA networks and network-effect one way or another whether we want to or not. These companies provide what have become public utility services that large parts of society now depends upon.

"Best products"? (this was the troll right? ;) More like "only products", with positions that were achieved in no small part through anti-competitive, anti-consumer behaviours that have run almost unchecked for the better part of two decades.

So what alternative are you talking about? Opting out? That's only an option in the same way that you can theoretically "opt out of society" and go and live in a cave for a year or two before you die of malnutrition/illness/etc.

It's a capitulation to the embedded powers-that-be to imply this theoretical qualifies as practical alternative.


Not true. In many European cities you cannot rent a bike without being a customer of either Apple or being the product of Google.

Even buying public transport tickets gets hard without the app, which of course is not on F-Droid. Often you pay more on other channels.


Man, you are all over this defending the behemoths. Can I see your GOGL tattoo??


Au contraire! DO feed the GRAFT - misinformation, noise, wrong answers only.

PS. The 1960's called and wanted their slogan back: "Fight the computer, fold your punch-cards."


Wow, did they intentionally try to make the most loathsome thing possible?


Aren't they already doing this with reCAPTCHA?


[flagged]


It isn't slavery when there's people willing to do free work for Google (as perverse as it may sound). Just look at their product forums.


Responding to parent comment:

  People want free stuff and then complain if some entity asks help.

  How's seeking unpaid help digital slavery? You must be very charitable person.

  And FYI here are some definitions from the dictionary, in case you meant something else by "digital slavery":

  slavery: the state of being a slave.
  slave: a person who is the legal property of another and is forced to obey them.


Maybe they are after more diverse answers than could be provided by their own employees. Meaning they don't hire diverse enough.


So many comments bemoaning a company benefiting from the free labour of others. Wait until they hear about open source software, they'll do their nut!


Curious as to why this is getting downvoted? It's literally how the economics of open source software works. Vanishingly few companies put back anything like the value they extract from the open source ecosystem and many, many projects are funded overwhelmingly by the free labour and effort of their committers.


With Open Source everyone can benefit, including a company. In this case here only Google benefits.


So if Google services behave in a tone-deaf way regarding the culture and languages of non-western countries, now it's the fault of us the foreigners for not contributing enough to the AI?



