

Group read on “Surveillance Capitalism” but in truth…
- tinkered with Linux as a kid
- contributed to Mozilla
- loved the ideal of free software relatively early on
… so it was rather coherent with related yet orthogonal efforts.


A friend of mine is a researcher working on large-scale compute (>200 GPUs), perfectly aware of ROCm, and sadly last month he said “not yet”.
So I’m sure it’s not infeasible, but if it’s a real use case for you (not just testing a model here and there but running frequently) you might unfortunately have to consider alternatives, or be patient.


Treating Google and Meta as apolitical …
I didn’t.


Tracking by WHOM, and thus WHY, should be the question.
It’s different to be tracked for profit, e.g. by Google or Meta, versus for political or corporate espionage purposes.
The former is basically volunteering information through bad practices. Those companies do NOT care about “you” as an individual. In fact they arguably do not even know who you are. Avoiding their services is basically enough. It might be inconvenient but it’s easy: just don’t.
The latter is a totally different beast. If the FSB targets you because you criticized Putin, or NSO Group does for something similar, or because you engineered something strategic for a business competitor who is a client of theirs, then you will be specifically targeted. This is an entirely different situation and IMHO radically more demanding. You can’t just follow good privacy practices, which is enough for the former; you have to know the state of the art of security.
So… assuming you “just” worry about surveillance capitalism, and hopefully live in a jurisdiction benefiting from the Brussels effect with e.g. GDPR-related laws, either way is fine.


I’m new to Linux from about 3 months ago, so it’s been a bit of a learning curve on top to learning VE haha. I didn’t realize CUDA had versions
Yeah… it’s not you. I’m a professional developer and have been using Linux for decades. It’s still hard for me to install specific environments. Sometimes it just works… but often I give up. Sometimes it’s my mistake, but sometimes it’s also because the packaging is not actually reproducible. It works on the setup that the developer used, great for them, but a slight variation throws you right into dependency hell.
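To make “CUDA had versions” concrete, here is a minimal sketch, assuming PyTorch as the stack in question (the thread doesn’t say): the CUDA version a framework build was compiled against has to match what the installed driver provides, and that mismatch is exactly where such environments tend to break.

```python
# Minimal sketch: report which CUDA runtime this PyTorch build was compiled
# against and whether it can actually reach the driver. PyTorch is only an
# assumed example; the same version-matrix issue exists in other stacks.
import torch

print("PyTorch:", torch.__version__)             # e.g. "2.4.1+cu121"
print("Built for CUDA:", torch.version.cuda)     # None on CPU-only builds
print("GPU usable:", torch.cuda.is_available())  # False on driver/runtime mismatch
```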


I’ll be checking over the subtitles anyway, generating just saves a bunch of time before a full pass over it. […] The editing for the subs generation looks to be as much work as just transcribing a handful of frames at a time.
Sorry, I’m confused: which is it?
doing this as a favour […] Honestly I hate the style haha
I’m probably out of line for saying this but I recommend you reconsider.


Exactly, and it works quite well, thanks for teaching me something new :)


There’s no getting around using AI for some of this, like subtitle generation
Eh… yes there is: you can pay actual humans to do that. In fact if you do “subtitle generation” (whatever that might mean) without any editing you are taking a huge risk. Sure, it might get 99% of the words right, but if it fucks up on the main topic… well, good luck.
Anyway, if you do still want to go that road, you could try the *.srt, *.ass, *.vtt or *.sbv formats (or mux them into the .mkv? Depends on context obviously).
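Since an editing pass is planned anyway, it may help that all of these are plain-text formats. As a minimal sketch (file name and cue text are hypothetical), this is essentially all an *.srt file contains:

```python
# Minimal sketch of the SRT cue format: index, "start --> end" timestamps,
# text, blank line. File name and cues are hypothetical placeholders.
cues = """\
1
00:00:01,000 --> 00:00:04,200
First subtitle line.

2
00:00:04,400 --> 00:00:07,000
Second subtitle line.
"""

with open("episode01.srt", "w", encoding="utf-8") as f:
    f.write(cues)
```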

Oh, that’s neat, thanks!
So in my use case, where I made a template for prototype metadata, a menu action could generate the file via the Exec= field instead of creating it from the template. This would prepopulate the metadata file with e.g. the list of selected files thanks to %F.
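As a minimal sketch of that idea (every name below is hypothetical, not the actual template): a menu entry pointing at a script via something like Exec=metadata-gen.py %F would make the file manager pass the selected files as arguments, and the script prepopulates the metadata file from them.

```python
#!/usr/bin/env python3
# Hypothetical generator wired to a menu action via e.g. "Exec=metadata-gen.py %F".
# The file manager expands %F to the selected files and passes them as arguments.
import sys
from pathlib import Path

TEMPLATE = "title: {title}\nfiles:\n{files}\n"  # stand-in for the real template

def main() -> None:
    selected = sys.argv[1:]  # the %F expansion
    files = "\n".join(f"  - {Path(p).name}" for p in selected)
    # Hypothetical output name; written wherever the action runs.
    Path("metadata.yaml").write_text(
        TEMPLATE.format(title="prototype", files=files), encoding="utf-8")

if __name__ == "__main__":
    main()
```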



Sad but unsurprising.
I did read quite a lot on the topic, including “Ghost Work: How to Stop Silicon Valley from Building a New Global Underclass” (2019), and saw numerous documentaries, e.g. “Invisibles - Les travailleurs du clic” (2020).
What I find interesting here is that the tasks seem to go beyond dataset annotation. In a way it is still annotation (as in you take data in, e.g. a photo, and circle part of it to label it, e.g. “cat”) but here it seems to be 2nd order, i.e. identifying the blind spots in how the dataset is handled. It still doesn’t mean anything produced is more valuable, or that the expected outcome is feasible with solely larger datasets and more compute, yet maybe it does show a change in the quality of the tasks to be done.


Enforcing GDPR fines would be a great start, only adding more if need be.
I feel like we could add more laws, but if they are not enforced it’s pointless, maybe even worse: it gives the illusion of privacy while in reality nothing changes.


None of your requirements are distribution-specific. I do all of that (Steam, non-Steam, Kdenlive, Blender/OpenSCAD, vim/Podman, LibreOffice, Transmission) and I’m running Debian with an NVIDIA GPU. Consequently I can personally recommend it.


FWIW MSN is Microsoft’s, so IMHO it might be better to link to the original source and, if need be, remind people they can either pay for journalism or use services like archive.is, which will bypass paywalls.


Funny to read this right after finishing an Elden Ring: Shadow of the Erdtree gaming session.
Yes, and so it should. In fact I’m gaming on Linux on:
…and I don’t even think about it. I just play regardless of the game being AAA or indie, VR or “flat”. It just works.
Brand new example: “Skills” by Anthropic https://www.anthropic.com/news/skills. Even though the audience here is technical, it is still a marketing term. Why? Because the entire phrasing implies agency. There is no “one” getting new skills here. It’s as if I were adding bash scripts to my ~/bin directory but, instead of saying “the first script will use regex to start the appropriate script”, I named my process “Theodore” and said I was “teaching” it new “abilities”. It would be literally the same thing, it would be functionally equivalent and the implementation would be actually identical… but users, specifically non-technical users, would assume that there is more than just branching options. They would also assume errors are just “it” in the process of “learning”.
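To make the analogy concrete, here is a toy sketch of that dispatcher (Python rather than bash, and every name is hypothetical): whether the table is called “Theodore’s skills” or “a dict” changes nothing about what runs.

```python
# Toy version of the ~/bin analogy (Python instead of bash, names hypothetical):
# a regex picks the appropriate script, nothing more.
import os
import re
import subprocess

SKILLS = {  # calling this "Theodore's skills" adds no agency to a lookup table
    r"\bresize\b": "~/bin/resize-image.sh",
    r"\bbackup\b": "~/bin/backup.sh",
}

def dispatch(request: str) -> None:
    """Use regex to start the appropriate script, as described above."""
    for pattern, script in SKILLS.items():
        if re.search(pattern, request, re.IGNORECASE):
            subprocess.run([os.path.expanduser(script), request], check=True)
            return
    # A miss here is a lookup failure, not "Theodore" still "learning".
    print("no matching script")

dispatch("please backup my documents")
```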
It’s really a brilliant marketing trick, but it’s nothing more.
To be clear, I’m not saying the word itself shouldn’t be used, but I bet that 99% of the time, if it’s not used by someone with a degree in AI or CS, it’s going to be used incorrectly.
The word “hallucination” itself is a marketing term. The fact that it’s frequently used in the technical literature doesn’t make it free of problems. It’s used because it highlights a problem (namely that some of the output of LLMs is not factually correct) but the very name is wrong. Hallucination implies there is someone, perceiving and with a world model, who, typically via heuristics (for efficient interfaces, as Donald Hoffman suggests), perceives incorrectly, leading to bad decisions regarding the current problem to solve.
So… sure, “it” (trying not to use the term) is structural, but that is simply because LLMs have no notion of veracity or truth (or anything else, to be clear). They have no simulation against which to verify whether the output they propose (the tokens out, the sentence the user gets) is correct or not; it is merely highly probable given their training data.


What’s your ramp-up prediction? I tinkered with a Banana Pi last year but don’t know about the broader implementation and production process, so I’m curious: when do you expect performance parity at, say, <10x the price (meaning the performance of an RPi for less than 500€) with e.g. an RPi4, RPi5, Core i9, etc.?


All my services are fine. I self-host. Yes, I’m quite pedantic about it. :D