Hacker News | majestik's comments

Is anyone here actually surprised Meta is recording and reviewing their content?

Vote with your dollars people.


I deleted my Facebook eleven years ago. I wish I could say it was for some cool reason about privacy concerns and whatnot, but honestly it's because I was spending way too much time arguing with people I barely knew, and I figured that that's not healthy.

I missed Facebook for about a day, and after that I barely even thought about it. In 2021 I bought an Oculus Quest 2, which at the time required a Facebook account so I made a throwaway one, but other than that I haven't been on Facebook (and I haven't even touched my Quest 2 in three years).

Point being, it's really not hard to get off Facebook and to ditch Meta products. More people should delete it.


> Point being, it's really not hard to get off Facebook and to ditch Meta products. More people should delete it.

As another poster mentioned, it can in fact be more difficult. Almost all of my social clubs/groups over the years migrated away from websites/forums to Facebook. I could give up an account, at the cost of losing effectively my entire social calendar.

I have a generic account with no real user data, but they still get all my content from the social groups so they still win I suppose.

My point ultimately I guess is that I have chosen the ability to continue to have a strong social life over my zuck hating principles.


My policy for years with facebook has been "post, don't scroll". I also use the Brave browser, uBlock Origin, and fb-purity extension. It's a tiny thing, and petty but it's better than being facebook's product for their advertising customers.


I don't actively use Facebook and I block most(?) of the tracking, but I do have an account simply because most of the information about my area is on there. This means events, safety updates, second hand shit.


Yeah, that's fair enough. My neighborhood doesn't have that so it's fairly easy to avoid the use of Facebook.

I still spend too much time arguing on HN but not as much as I was on Facebook and the audience here is generally more educated and so the arguments aren't as mind-numbing.


Yes, I'm surprised at this. I would've never expected they would be doing this, and I didn't exactly have high expectations of Meta. This is incredibly invasive and not at all what people expect.


Am I so cynical, or does this sound hopelessly naive? This is exactly what I would expect. Certainly of Meta. Amazon had to go out of their way to reassure people that Siri wasn’t always recording. And I’m still not entirely sure I believe that.


I would have been the first to talk about meta being horrible for privacy and this goes even further than what I expected, which was:

- they have the opportunity to save the video feed at any time
- they are probably storing some kind of metadata of the feed, maybe some kind of analysis output
- someone could hypothetically watch it

I thought it was dangerous because I thought they could do what they're doing, but I didn't think that right now they actually were and so overtly


I am also surprised, but not because I believe Meta to care about the ethics of the whole thing. After all their privacy scandals, I’d assume they’d have policies in place to prevent something that can so easily be leaked. But here we are


The thing is it's not just surprising from a privacy standpoint but also from an engineering standpoint -- this sounds very data-, power-, and storage-intensive, in a device that's very constrained on all sides, so it wouldn't have even occurred to me this was a possibility. When are they even uploading all the videos without blowing through their power budget and internet data limits? Are they heavily compressing it to like one frame per second or something?


> data limits

The data required is small. Each embedding might be 1/2 kB per face.

> power budget

To process a video for biometric feature extraction, it might take 0.5% to 2% of the total power used to record a video. Video uses a lot of power (compression, screen, etc)

Assuming you've got a modern device (e.g. with Apple Neural Engine). Disclosure: Googled info (Gemini).
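
For scale, here is a back-of-the-envelope sketch of where a figure like "1/2 kB per face" could come from. The 128-dimension float32 embedding and the faces-per-day count are assumptions for illustration only; nothing in the article confirms what, if anything, is extracted on-device, and this says nothing about the raw video the article describes.

```python
# Back-of-the-envelope sizing for face embeddings.
# ASSUMPTIONS (not from the article): 128 dimensions, 4-byte float32 values.
EMBEDDING_DIMS = 128
BYTES_PER_FLOAT32 = 4


def embedding_bytes(dims: int = EMBEDDING_DIMS,
                    bytes_per_value: int = BYTES_PER_FLOAT32) -> int:
    """Size of one raw, uncompressed embedding in bytes."""
    return dims * bytes_per_value


def daily_upload_bytes(faces_per_day: int) -> int:
    """Total bytes per day if one embedding is sent per detected face."""
    return faces_per_day * embedding_bytes()


if __name__ == "__main__":
    print(embedding_bytes())          # 512 bytes, i.e. 1/2 kB
    print(daily_upload_bytes(1_000))  # 512000 bytes, about half a megabyte
```

Under those assumptions the upload cost of embeddings alone is negligible next to video, which is the point being made above.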


> The data required is small. Each embedding might be 1/2 kB per face.

"Embedding"? This is what the article says:

"In some videos you can see someone going to the toilet, or getting undressed. I don’t think they know, because if they knew they wouldn’t be recording."

You're saying they're watching "embedding"s here?


I mean, you store as many as you need right? I’m sure 99% of the data is just immediately trashed. They have enough home videos of kids to not need to label them again, but those videos where someone is undressing in their bedroom will show up as ‘undertagged’ because few people make those videos.


I find it extremely naive too. I expect much worse than this from Meta and I am often amazed at just what it is going to take for people to realize what Meta is and does. I mean it is not like we have 11 million examples of what and who they are. In this story I would have expected additionally that Meta would notice a little bit of cellulite in the woman that was changing and then have the employees call her husband to tell him to surprise her with an amazing cream he should buy her for their upcoming anniversary (and if this was actually part of the story I would be able to continue on top of this and would not be surprised if true).


Amazon Siri?


Haha, yeah. I wrote that, thought something was wrong, but I couldn’t put my finger on why. Figured it probably wasn’t important to the point :)

But yeah, should have been Alexa


Yeah, this is something you 100% should have expected. This could not be more on brand for facebook. Even if someone told me facebook wasn't using their glasses to invade the privacy of their users I wouldn't believe them. Compromising people's privacy for profit is what facebook does. Violating the trust of their users is basically all facebook has ever done.


I’m not sure what sort of signals you’ve gotten from Meta that would suggest they are above this type of behavior?


> I’m not sure what sort of signals you’ve gotten from Meta that would suggest they are above this type of behavior?

It wasn't Meta's morals that gave me any signals to that effect. It was the potential legal minefield on top of the engineering challenges [1] that made it so I didn't even consider this as a possibility. In fact I'm still confused. I don't understand how they would be pulling this off despite those challenges, and I would love to.

[1] https://news.ycombinator.com/item?id=47225772


When you buy them and set them up you are told this many times. The onboarding screams at you that everything you do is used for training AI.

Maybe this changed since I set mine up, but I felt so damn informed I was getting tired of tapping I understand.


There's plenty of people that don't own these smart glasses, as far as i know it's still only early adopters using them but i guess i could be wrong. The nice thing is you actually can vote with your feet here because there's no network effects, whereas there's tons of people that are stuck being on facebook or instagram because of everyone else that's on there.


Yes, and this is a good start:

https://github.com/hagezi/dns-blocklists?tab=readme-ov-file#...

Among others, blocks Meta/Facebook/Google/Apple trackers and ads. Every router on the planet should run this.
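
For anyone wondering what a DNS blocklist does mechanically, here is a rough sketch of the matching step. The domains and the `parse_blocklist` / `is_blocked` helpers are hypothetical and written for illustration; real lists come in several formats and real DNS filters apply their own matching rules.

```python
# Minimal sketch of DNS-blocklist matching. Hypothetical example data;
# real lists and resolvers differ in format and matching rules.

def parse_blocklist(text: str) -> set[str]:
    """Parse one-domain-per-line text, skipping blanks and '#' comments."""
    domains = set()
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip().lower()
        if line:
            domains.add(line.rstrip("."))
    return domains


def is_blocked(hostname: str, blocklist: set[str]) -> bool:
    """True if hostname or any parent domain is on the blocklist."""
    labels = hostname.lower().rstrip(".").split(".")
    # Check "a.b.example.com", then "b.example.com", then "example.com", ...
    return any(".".join(labels[i:]) in blocklist for i in range(len(labels)))


if __name__ == "__main__":
    sample = """
    # hypothetical entries, not taken from any real list
    tracker.example
    ads.example.net
    """
    blocked = parse_blocklist(sample)
    print(is_blocked("pixel.tracker.example", blocked))  # True
    print(is_blocked("www.example.net", blocked))        # False
```

A resolver doing this on the router answers blocked names with a dead address, so every device behind it is covered without installing anything.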


Your dollars don't matter. They get so much state funding that this is just how the future is going to be. You'll like it.


I was already a Tesla owner and I reserved a Cybertruck right after I saw the original Cybertruck Unveil live stream on November 21, 2019. The infamous one where the window glass shattered.

That was when it was supposed to cost around $35,000.

Four years later when my reservation was ready to order, on December 8, 2023, the CyberTruck cost more than $100k.

Because it cost almost 3x more than what was originally advertised, I cancelled the order. I know many other people who canceled for the same reason. Keeping in mind this was after several delays, so I and many others with reservations were already frustrated with the product before it became available to order.


Google welcome to Apple 10 years ago


I can't put my finger on it but there's a weird tension between the two Daves in this video. Almost like Rosenthal is trying to impress or earn the praise of Scherer.

Is there a backstory between these guys / FDB?


Ha, well I met Scherer ~30 years ago in a high school math class and we’ve done three companies together, so you could say we’ve known each other for a bit :)


Lol - to find your one comment in this thread wayyy down at the bottom was delightful. That demo you reference from techcrunch disrupt ( start of this video - https://youtu.be/Nrb3LN7X1Pg) - that is a great demo. I'm curious - is that an idea/moment in your life you look back on and has that staying power, something that randomly pops in your mind that fills you with pride, or not really?


We thought of that demo with like a week or two to go before the event when we realized we needed something more visceral than a blinking cursor :)

So, it stayed with the world because it went semi viral in some tiny circles. But it didn’t stay with me because I personally learned little from it.

The thing that stays with me is all the things I learned spending years trying to implement an API you can describe in ~30 seconds really well. It’s a rare opportunity to be able to go so deep on a problem in a career.


TLDR: AirBnb is spending “hundreds of millions” to become the new Craigslist.


I read the article, and while it’s great the model can generate relevant output- so what? The article doesn’t discuss any action being taken using that output.

So what’s the big breakthrough here?


This deal isn't about security, it's about data.

Google already have one of the best security teams in the industry - Project Zero [0]. They don't need Wiz's "enterprise" expertise for security.

This deal is about DATA. Wiz, as a cybersecurity vendor, have full remote access to their customers' cloud compute storage (EC2 EBS volumes, etc) in the name of "security scanning" - this is actually part of their unique selling point - "agent-less scanning" which is unlike traditional security tools that require an agent installed in the OS. Instead, Wiz is able to just clone your full data volume and scan it locally in their cloud accounts/VPC.

With this deal Google has bought a ton of confidential data from Wiz's customers without their explicit knowledge or approval, and they will use it to improve Google's AI models like Gemini and probably several other products.

A year ago Google struck a $60M/yr deal with Reddit to exclusively license their content [1] for the same reason, and that data is probably much smaller and less valuable than the data Wiz has access to from their customers, which include companies like Morgan Stanley, DocuSign, Slack, Plaid, and others. [2]

Sources:

0: https://googleprojectzero.blogspot.com

1: https://www.reuters.com/technology/reddit-ai-content-licensi...

2: https://www.wiz.io/customers


I find it hard to believe (or maybe I don’t want to believe) that this could ever happen? Even if Wiz has T&C’s that allow full access to clients’ data, and even if the T&C allow some sort of “use” of that data that includes training an LLM, surely you can’t release an AI trained on private information to the public? You can’t have Gemini spitting out internal/private/confidential information?

Am I just naive?


na you're right this would be a dumb move with a huge blow back


It's only dumb if they get caught doing it. If they do it once and keep it quiet and then someone finds out 2 years later, it's going to be a footnote in history.


I'm guessing you would be the same guy who wouldn't torrent millions of books and copyrighted works to train your LLM. Zuck can afford not to care about that pesky detail

You are not naive, you are not considering that at certain scales, your concerns are the cost of doing business.


Not the same thing at all. Corporations care about their data a lot and would cancel deals over this. No one cares if some authors get upset, they have no leverage. Disappointing how people will make confident statements while being so clearly clueless.


> Corporations care about their data a lot and would cancel deals over this.

Since you have mentioned "a lot", share a few examples, pls.


So many sources yet no source of the actually outrageous claim that Google will use this to illegally siphon customer data

maybe this deal is about a company with a lot of revenue in an area google is heavily investing in: cloud security?


Facebook did exactly this with a VPN acquisition. They didn't break into customer data; they just mined it for usage patterns.

So as a pure speculation on Goog's motives, it doesn't sound farfetched enough to call ridiculous. Competitive data is valuable, particularly if you want to strangle the youth in their cradles (or acquire them).


google is not facebook, and an ad-supported consumer software is not cloud. OP talked about AI training which is a bit more than metadata

also, the vpn example ended in court


> actually outrageous claim that Google will use this to illegally siphon customer data

Hypothetical question as much as anything: If Google purchases a company and the data the company stores about their customers, is it illegal for them to use this data for whatever they want?

Let's say it would give them an understanding of what features from AWS people tend to use the most, and they use that to improve Google Cloud, would that be illegal?


yes, due to privacy and contract obligations

as well as this is the surest way for GCP to spectacularly commit suicide


Unless you're talking about some specific Wiz<>customer contracts, how do you know?

AFAIK, there are no explicit laws forbidding that. Maybe you could share what law you think this would be breaking?


OP mentioned training AI on customer data

GDPR, CCPA, HIPAA, etc, as Google has no way of knowing which data they will train on, add to that copyright and that's just off the top of my head

cloud contract obligations are also pretty clear about customer data.

furthermore it would be bad engineering and security if Wiz had actual direct access to customer data, versus having their code having access to said data. That would be a huge issue in due diligence for example


Did you skim through Wiz's Privacy Policy? They're keeping a lot of stuff that isn't "direct access to customer data" and is already permitted to be sent to 3rd parties. It wouldn't surprise me if you could aggregate what features are most used on AWS by collating some other sources rather than having actual access to customers' cloud.

Obviously, existing agreements would need to continue to be run properly, no question about that. But there is always plenty of other data that probably could be used by Google to gain some insights.


what you talked about is different and is aggregated metrics

that might be legal and interesting but i highly doubt it's 30+ billion dollar interesting

i imagine you can buy that data from data brokers without any legal exposure but that's only a guess


Read through the Wiz MSA [0] at section 6 which discusses “Customer Data” and among other things specifically asks Customer not to send HIPAA data (perhaps to sidestep the issue you just raised) and concludes with this:

Customer hereby grants to Wiz a non-exclusive, worldwide, royalty-free right to use Customer Data to provide the Services and perform its obligations under this Agreement.

Or if reading terse legal documents isn’t your thing, go ahead and just read through Wiz’s own blog post about how their scanner works, which confirms they have full, direct access to customer EBS volume snapshots in the default “full SaaS” deployment model. [1]

Your point that due diligence would have taken issue with this might not be grounded in Google’s reality.

0: https://wiz.pactsafe.io/legal#wiz-subscription-agreement

1: https://www.wiz.io/blog/the-wiz-approach-to-agentless-scanni...


> [access to use customer data...] *to provide the Services and perform its obligations under this Agreement.*

"Services" – which you'll note is capitalized... lawyers do that for a reason – has a very specific meaning that very obviously does not include "whatever the fuck Google wants to do with it", nor "training general purpose AI models" in particular.

Why are you intentionally and blatantly misinterpreting Wiz's policies? Or are you just that good at ignoring/missing details in order to weave the story you've already decided to believe?


I've been consistently surprised at how common bad engineering and security practices seem to be within the security vendor space though. So idk this just makes it sound more plausible to me cause this would be exactly the type of company to have a scandal like that.


This is an incredibly stupid take on the deal.


This is an incredibly useless comment [0]

At least say why you think so and contribute to the conversation a bit.

[0] https://news.ycombinator.com/newsguidelines.html#comments


there's no need to wrestle with pigs


The comment effectively says "wake up to yourself, this nonsense isn't welcome".

Some things are self evidently stupid, cynical and/or disingenuous to anyone with a modicum of intelligence and a cursory understanding of the field.

Use your hall monitoring energy to add value. The type of post I call out here reduces the value of the forum.


Google isn’t buying Wiz for “security expertise”, they’re buying Wiz for a security product, in a growth area, that customers absolutely love. You’ve provided no evidence for the conspiracy theory that google is buying Wiz to siphon up a bunch of data, and if you’re going to link to Wiz, maybe link to their public list of security certifications, many of which prohibit the type of data harvesting you are suggesting.

https://trust.wiz.io/


"Trust" screams insecurity. Security is in the direction of trustless rather than requiring trust. Do you trust companies which say front and center "you can trust us"?

Wiz is a "security product"? Security isn't something you can buy and bolt on to your systems as an afterthought. It doesn't work like that!


I’m honestly not sure what your point, if any, is.


That the security software industry is kind of full of shit sometimes is I think what they were getting at.


Yes. You put it more eloquently.


Based on the exceptional level of ignorance and outright delusion in this thread, I'd rather not speculate. Easily 1/3 of the discussion is mired in conspiracy theories about Israel, and another 10 - 20% are people whose comments can be boiled down to "you know, I've never heard of this product/company/industry before, but, by God, the world needs to hear my hot take."


I trust open source code I can see and compile and control. :)

How is "trusting wiz" (trusting some icons on a website controlled by wiz leading to publicly inaccessible reports, half of which are done by a single company somewhere in Florida) related to what Google might do with it after acquisition?


That’s great. For you. Most businesses don’t have the ability or desire to build every single security tool they use in-house or use open source for everything. So they buy commercial tools. Which are audited by third parties to give the companies that use the commercial tools some idea of how their data will be used.

If google wants to maintain those audit findings, which they’ll need to do to keep most of their customers, that’s going to limit the kind of data collection they can do. Unless, of course, you want to propose a new conspiracy theory (which I guess would be par for the course in this thread) that Google is going to lie to their auditors to get at that sweet, sweet data (most of which they already have for their GCP customers and don’t need to buy Wiz to obtain.)


Google has GCP customer data, but the Wiz tool apparently works not just for GCP, so there's a lot of competitor cloud data to be had from acquiring it.


I believe you are right in the direction, but wrong on the details. Yes they will now have tons of otherwise inaccessible data about how Wiz customers use GCP’s competitors (AWS/Azure), e.g. what workloads they run, how many EC2 / EKS / ECS / RDS / S3 / SageMaker resources are actually used, and how much they pay. This is by itself highly valuable financial information that any company would love to have about their direct competition.

I highly doubt Google or Wiz have a legal avenue that allows them to use customer data beyond fulfilling their product needs. Products like Wiz (voluntarily) go through security audits and certifications, from SOC2 type 2 to FedRamp. Also enterprise customers actually do read T&C (their legal team does at least) and having terms and conditions that allow you to train models on customer data without their consent is not going to fly under the radar for long.


Google has the best security. But it is hard to market real security (as opposed to snake-oil), so maybe this acquisition will help.


> Google has the best security.

Care to elaborate?


Google was owned pretty hard in 2009 (Operation Aurora). Following that they put security front and center in a way that few other vendors do.

You can read my praise of ChromeOS here: https://news.ycombinator.com/item?id=41178525

To add a few, Chrome was the first browser to introduce process isolation: Every browser tab, every site (second-level domain) and every iframe runs in its own sandboxed process.

With that it's the only end-user software (alongside the other browsers) that actually is secure against Spectre and Meltdown. Operating systems only protect against Spectre/Meltdown leaks between processes.

Google invented Certificate Transparency and Chrome has enforced CT for years. Firefox added CT enforcement only a few days ago.

CT solves the following: For example, if a rogue Chinese Certificate Authority decides to issue a cert for google.com to the Chinese government for Man-in-the-Middle attacks, CT blows their cover and makes it known to everyone that the CA issued a fraudulent cert.
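
A rough sketch of why a logged cert is hard to hide: CT logs are append-only Merkle trees (RFC 6962), so every entry feeds into one root hash that monitors can compare. The certificate byte strings below are placeholders, and this leaves out everything else a real log does (signed tree heads, inclusion and consistency proofs).

```python
import hashlib


def merkle_tree_hash(entries: list[bytes]) -> bytes:
    """Merkle Tree Hash as defined in RFC 6962, section 2.1."""
    n = len(entries)
    if n == 0:
        # Empty tree: hash of the empty string.
        return hashlib.sha256(b"").digest()
    if n == 1:
        # Leaf hash: 0x00 prefix keeps leaves distinct from interior nodes.
        return hashlib.sha256(b"\x00" + entries[0]).digest()
    # k is the largest power of two strictly smaller than n.
    k = 1
    while k * 2 < n:
        k *= 2
    left = merkle_tree_hash(entries[:k])
    right = merkle_tree_hash(entries[k:])
    # Interior node hash: 0x01 prefix.
    return hashlib.sha256(b"\x01" + left + right).digest()


if __name__ == "__main__":
    # Placeholder entries standing in for logged certificates.
    log = [b"cert for example.org", b"cert for example.net"]
    before = merkle_tree_hash(log)
    log.append(b"mis-issued cert for google.com")
    after = merkle_tree_hash(log)
    print(before.hex())
    print(after.hex())
    print(before != after)  # the new entry changes the root
```

Since browsers that enforce CT require proof that a cert was logged, the rogue CA has to publish the cert in a structure like this, where the domain owner can spot it.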


Project Zero is about finding security issues, not about developing products to increase security.


Using private data to train a public LLM seems like a huge liability that Google's legal team would never approve. I could see them using the data for all sorts of kinds of analytics though. I heard Google deals in those a lot.


Project Zero and Wiz have very little in common. It's wrong to bring these two up together as if they are comparable. Project Zero focuses on the discovery and analysis of new (including zero-day) vulnerabilities. I do not believe Wiz uncovers new vulnerabilities. The skillset of someone working on Project Zero looks very different from someone working on Wiz.

The field of security is huge. It's unhelpful to lump unrelated things together.


> I do not believe Wiz uncovers new vulnerabilities

Oh they do. https://www.wiz.io/blog/tag/research

A few fun ones are the multiple cross-tenant security exploits they found in Azure (which is why, among the tons of other reasons, Azure is just the worst possible choice for a cloud vendor from the big 3 - their security is a joke, and none of the vulnerabilities below should have passed even a cursory security review, but they did, which means the whole org doesn't take security seriously. Add in the fact that it's slow as hell, and has the UX worthy of an Enterprise vendor, the only reason to choose it is because you're getting a good deal on the golf course for it):

https://www.wiz.io/blog/azure-active-directory-bing-misconfi...

https://www.wiz.io/blog/omigod-critical-vulnerabilities-in-o...

https://www.wiz.io/blog/secret-agent-exposes-azure-customers...

https://www.wiz.io/blog/chaosdb-how-we-hacked-thousands-of-a...


> They don't need Wiz's "enterprise" expertise for security.

Yes, because exploit discovery is exactly what enterprise security is.


This theory of yours is a conspiracy. Google would never start training off of confidential corporate information without authorization. The legal team would never allow it. And if they ever got caught, it would be a complete disaster for them.


That doesn't sound very secure at all


Thousands of lawsuits coming up? How are any of the mentioned companies okay with their highly confidential data being scanned by AI?


The top three topics of batshit conspiracy theory supported by precisely zero actual evidence:

1) Hidden cabals colluding in secret to control world events.

2) Extraterrestrial beings live among us secretly controlling world events.

3) Google illegally steals private data to secretly control world events.


“No punting — we can't keep building nanny products. Our products are overrun with filters and punts of various kinds. We need capable products and [to] trust our users.”

What does Sergey mean by this?


He is probably referring to "responsible AI" bullshit (i.e. censorship and refusing to answer questions).


Source?



Back online after nearly 24 hours, with no explanation.

Here’s the update from Twitter:

https://x.com/askplaystation/status/1888376843255824475?s=46

