Hacker News | decide1000's comments

Although I find the goal you are trying to achieve questionable, I believe it should not be the AI that judges you here.

We are all witnessing the start of an AI era that will not end soon. Guardrails are part of this development. I do have questions about the people, or systems, that decide what counts as good and bad behavior. This tech is used in every country in the world. As long as someone can pay the subscription in dollars, they can use it. Is it up to a company to decide what's good or bad behavior? Is this a debate? Is this politics? Is this just the vision of one company? Will it shift over time? Will it be stricter for more hyper-intelligent models? Will it change as open source models keep getting better?


By scraper tech I mean a Rust binary that can download and process thousands of concurrent URLs (millions per hour). Not to the same domain, obviously. Paying more is not the issue here; it's more the idea that an AI decides where on the spectrum I operate. Why is it opinionated? I am not doing anything wrong, so why does it make me feel like I have to defend myself?
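For readers wondering what "thousands concurrent, but not to the same domain" can look like in practice, here is a minimal sketch in Python rather than Rust. It is an illustration only, not the binary described above: `crawl`, `fetch` and `max_concurrent` are hypothetical names, and the caller supplies `fetch` (any async function that downloads one URL).

```python
import asyncio
from collections import defaultdict
from urllib.parse import urlsplit


async def crawl(urls, fetch, max_concurrent=1000):
    """Run fetch(url) for every URL with at most one in-flight request per host.

    Hypothetical helper: `fetch` is any async callable supplied by the caller.
    """
    global_limit = asyncio.Semaphore(max_concurrent)  # cap on total in-flight requests
    host_locks = defaultdict(asyncio.Lock)            # one lock per hostname

    async def worker(url):
        host = urlsplit(url).hostname or ""
        # Take the host lock first, so tasks queued behind a busy host
        # do not occupy one of the global slots while they wait.
        async with host_locks[host]:
            async with global_limit:
                return await fetch(url)

    # gather() returns results in the same order as the input URLs.
    return await asyncio.gather(*(worker(u) for u in urls))
```

The point of the design is that raw throughput comes from spreading requests over many hosts, while any single host only ever sees one request at a time.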

What is the specific concrete purpose of downloading millions of URLs per hour across different domains if it's "not doing anything wrong"?

Mostly ecommerce and pricing data. I work for marketplaces, brands, retail stores and even our own SaaS competitors. We match the EAN (GTIN) to the correct SKU within seconds (Google Shopping, Amazon, etc). Part of it relies on ML models we trained ourselves.
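As a small aside on the EAN (GTIN) part: before matching a barcode to a SKU, a pipeline would typically reject malformed codes using the standard GTIN check digit. The sketch below shows only that public check-digit rule; the function names are made up for illustration and it says nothing about how the matching described above is actually done.

```python
def gtin_check_digit(data_digits: str) -> int:
    """Check digit for the data part of a GTIN (every digit except the last)."""
    total = 0
    # Weights alternate 3, 1, 3, 1, ... starting from the rightmost data digit.
    for i, ch in enumerate(reversed(data_digits)):
        total += int(ch) * (3 if i % 2 == 0 else 1)
    return (10 - total % 10) % 10


def is_valid_gtin(code: str) -> bool:
    """True if code is a GTIN-8/12/13/14 whose last digit is the correct check digit."""
    if not (code.isascii() and code.isdigit()) or len(code) not in (8, 12, 13, 14):
        return False
    return gtin_check_digit(code[:-1]) == int(code[-1])
```

For example, `is_valid_gtin("4006381333931")` returns `True`, while changing the last digit to `2` makes it return `False`.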

Might it be for scraping content to train an LLM? Oh no, only big tech is allowed to do that...

"The gangsters do it and get away with it so any random person should get to as well"? Not a particularly defensible position if that's an accurate paraphrase.

I am on the latest version available to me, 2.1.98.

Version 2.1.113 is available as of this comment. I think the brew version lags behind the other ways of installing it.

I am not using brew. Just checked and it still says 2.1.98. Will try manual update.

I built a distributed DuckDB setup using OpenRaft for state replication. Every node holds a full copy of the database. Writes go through Raft consensus, reads are local. It's more like etcd-with-DuckDB than MotherDuck-lite.

OpenDuck takes a different approach: query federation, with a gateway that splits execution across local and remote workers. My use case requires every node to serve reads independently with zero network latency, and to keep running if other nodes go down.

The PostgreSQL dependency for metadata feels heavy. Now you're operating two database systems instead of one. In my setup DuckDB stores both the Raft log and the application data, so there's a single storage engine to reason about.

Not saying my approach is universally better. If you need to query across datasets that don't fit on a single machine, OpenDuck's architecture makes more sense. But if you want replicated state with strong consistency, Raft + DuckDB works very well.
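To make the "writes go through consensus, reads are local" idea concrete, here is a toy sketch. It is not Raft and not the setup described above: there are no terms, elections or real networking, only a quorum-gated log that every live node applies to its own full copy of the data. `sqlite3` from the standard library stands in for DuckDB purely so the example runs without extra packages, and `Node` and `Cluster` are hypothetical names.

```python
import sqlite3


class Node:
    """One replica holding a full local copy of the data."""

    def __init__(self, name):
        self.name = name
        self.up = True
        self.applied = 0  # number of log entries already applied locally
        self.db = sqlite3.connect(":memory:")  # stand-in for a local DuckDB file
        self.db.execute("CREATE TABLE kv (k TEXT PRIMARY KEY, v TEXT)")

    def apply(self, log):
        # Apply every committed entry this node has not seen yet, in log order.
        for k, v in log[self.applied:]:
            self.db.execute("INSERT OR REPLACE INTO kv VALUES (?, ?)", (k, v))
        self.applied = len(log)

    def read(self, k):
        # Reads never leave the node: no network hop, no coordination.
        row = self.db.execute("SELECT v FROM kv WHERE k = ?", (k,)).fetchone()
        return row[0] if row else None


class Cluster:
    """Toy coordinator: a write commits only if a majority of nodes is up."""

    def __init__(self, names):
        self.nodes = [Node(n) for n in names]
        self.log = []

    def write(self, k, v):
        live = [n for n in self.nodes if n.up]
        if len(live) * 2 <= len(self.nodes):
            raise RuntimeError("no quorum, write rejected")
        self.log.append((k, v))
        for n in live:
            n.apply(self.log)
```

In this toy, a node that was down keeps answering reads from its last applied state and catches up by calling `apply(cluster.log)` once it is back, which is the trade-off the comment describes: reads stay available everywhere, writes need a majority.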


I'm building a Redis real-time backup platform with PITR (point-in-time recovery) to the exact millisecond. Besides downloads, it has an emergency recovery option where you connect the Redis client directly to the master-replica to receive the latest version.

The key explorer lets you change data on the fly and receive notifications in real time when a condition is met (if value contains X).

It's built in Rust on bare metal, with isolation between clients and data.
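For anyone unfamiliar with PITR, the core idea can be sketched as replaying a timestamped write log up to the requested millisecond. This is a conceptual illustration under my own assumptions, not how the platform above works: the `(timestamp_ms, op, key, value)` entry format and the `restore_to` name are hypothetical, only `SET` and `DEL` are modeled, and the log is assumed to be sorted by timestamp.

```python
from typing import Dict, Iterable, Optional, Tuple

# Hypothetical log entry: (timestamp in ms, operation, key, value or None)
Entry = Tuple[int, str, str, Optional[str]]


def restore_to(log: Iterable[Entry], target_ms: int) -> Dict[str, str]:
    """Rebuild the key space as it was at target_ms (inclusive).

    Assumes the log is ordered by timestamp, oldest entry first.
    """
    state: Dict[str, str] = {}
    for ts_ms, op, key, value in log:
        if ts_ms > target_ms:
            break  # everything after the target moment is ignored
        if op == "SET":
            state[key] = value
        elif op == "DEL":
            state.pop(key, None)
    return state
```

Replaying from the very beginning gets slow on a long log, so a real system would more likely start from a periodic snapshot and replay only the entries after it.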

ReplicaSafe.com (nothing there yet, will take a few weeks)


I think this model proves it's very efficient and accurate.


But it could potentially be even more efficient if it was single-language.


I have been on Linux for 26 years. For the last 5 years I have run Pop!_OS on my desktop. User friendly and stable, Ubuntu based.


Linux user for 25-ish years here, and exclusively so.

I used to heavily configure my Ubuntu installs to be keyboard exclusive with i3wm and such, but I ended up with Regolith Desktop, a version of Ubuntu with i3wm preinstalled and a keyboard focus. I'm too old to keep my dotfiles updated.

Nowadays, imo, you only need to choose the package manager; any OS using that package manager (aptitude for Ubuntu) definitely has a version that's close enough to your use case.


Similarly, I picked up Linux in 1997 and have used it as my primary since 1999. I've distro hopped through probably a dozen or so distros, but ultimately landed on PopOS for most of my machines, similarly to you. These threads are always somewhat disheartening, hearing everybody say they tried to switch but couldn't because of one issue or another. I guess I've just learned to work through it.


China and Europe (Mistral) show that models can be very good and much smaller than the current ChatGPTs and Claudes of this world. The US models are still the best, but for how long? And at what cost? It's great to work daily with Claude Code, but how realistic is it that they keep this lead.

This is a new tech where I don't see a big future role for US tech. They blocked chips, so China built their own. They blocked the machines (ASML) so China built their own.


>This is a new tech where I don't see a big future role for US tech. They blocked chips, so China built their own. They blocked the machines (ASML) so China built their own.

Nvidia, ASML, and most tech companies want to sell their products to China. Politicians are the ones blocking it. Whether there's a future for US tech is another debate.


> but how realistic is it that they keep this lead.

The Arabs have a lot of money to invest, don't worry about that :)


I am the co-creator of ShoppingScraper. Convert an EAN / GTIN to pricing information, product specs, content or image. API-based and rapid with pricing and barcode data.

The website needs some love, but the webapp is going well.

https://shoppingscraper.com/features


They say: Dashboard Login Latency and Timeouts

They mean: Your server will get connection timeouts through the Cloudflare proxy

