What was the bug? I configure DNS for both public and private networks on cloudf...

stult · 2026-04-17T13:34:21

That had previously been my experience with CF too. In this case, I was migrating my domain over from the registrar and, per standard practice, updated the nameservers to point to CF, then waited for CF to detect the updated NS records. Two days later (well after DNS should have propagated), CF was still displaying an error saying the update to the domain's DNS records hadn't been detected.

There's not a lot of UI surface area a user can touch that can even theoretically affect the NS detection process, because that process happens entirely "under the hood" in CF, as it were. You more or less just have to wait for CF to detect the DNS changes. That said, I tried everything I could think of to get their detector to reset, including deleting and recreating the site from scratch in CF. After another few days of combing through CF docs and forums, and after changing and reverting every setting I possibly could, I concluded there was no workaround available to me as a user and tried to reach CF as I described above.

Having done this many times before, I am quite certain that I set the nameservers correctly. I even had two other very experienced engineers review what I had done to make sure I wasn't falling victim to some mental blind spot that prevented me from recognizing the problem. I think every SWE has had the experience of spending an enormous amount of time debugging a problem only to realize they mistyped a magic string somewhere, and their brain just straight up refused to recognize the typo. Unfortunately, that was not the case here. The other engineers saw what I saw and were also unable to fix the problem.
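For anyone who wants to rule out the typo scenario mechanically rather than by eyeball, a check along these lines is one way to do it. This is only a rough sketch, not what CF's detector does: the nameserver hostnames and `example.com` are placeholders, the function names are made up for illustration, and it assumes `dig` is installed. It also asks whatever resolver your machine uses, so a stale cache can still mislead you.

```python
import subprocess


def normalize(ns: str) -> str:
    """Lowercase a nameserver hostname and drop the trailing root dot."""
    return ns.strip().lower().rstrip(".")


def parse_dig_short(output: str) -> set[str]:
    """Turn the output of `dig +short NS <domain>` into a set of hostnames."""
    return {normalize(line) for line in output.splitlines() if line.strip()}


def compare_nameservers(expected, observed):
    """Return (missing, unexpected) hostnames as two sorted lists."""
    want = {normalize(ns) for ns in expected}
    have = {normalize(ns) for ns in observed}
    return sorted(want - have), sorted(have - want)


def live_nameservers(domain: str) -> set[str]:
    """Ask the local resolver for the domain's NS records (requires `dig`)."""
    result = subprocess.run(
        ["dig", "+short", "NS", domain],
        capture_output=True, text=True, check=True, timeout=10,
    )
    return parse_dig_short(result.stdout)


if __name__ == "__main__":
    # Placeholder values: substitute the pair shown in your own dashboard.
    assigned = ["ada.ns.cloudflare.com", "bob.ns.cloudflare.com"]
    missing, unexpected = compare_nameservers(assigned, live_nameservers("example.com"))
    print("missing:", missing)
    print("unexpected:", unexpected)
```

If both lists come back empty, the delegation your resolver sees matches what you were assigned, and a typo on your end is off the table.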

I was subsequently able to set DNS up on Vercel without any trouble at all. Bottom line, the issue was almost certainly a bug in Cloudflare's code. That indicates a code quality problem to me. Combine that with the reckless incompetence it takes to try to automate customer support with a chatbot that doesn't even have accurate information about their own processes and basic contact information, never mind a reasonable escape hatch to actual human-provided support in unusual cases (even for a paying customer), and I simply no longer trust them to deliver a reasonable quality product.

They didn't even maintain any mechanism for reporting bugs to them, which is just insanity because it means there is no way to inform them even in extreme cases like a critical security bug. I get that they want to cut costs by reducing the employees needed to deal with customer service complaints, but it costs practically nothing to have a little feedback form somewhere, especially now that an LLM can handle most user feedback processing. Or failing that, a functioning support email address or phone number. But they can't even clear that incredibly low bar.

All of these issues could have been avoided with a very limited application of ordinary common sense and foresight. Whoever programmed their chatbot did not take the time to set up a decent RAG system with up-to-date information about their support processes and how to contact them, even though that is an obvious requirement for a tech support chatbot. They should also have recognized the business risks posed by exposing their customers to a system which lacks any escape hatches for outlier cases requiring actual human support, which risks alienating customers like me by forcing us to jump through Kafkaesque bureaucratic hoops just to get simple problems addressed, and--even worse--making it impossible to resolve such problems after jumping through all their hoops. The team implementing this chatbot didn't even think to include a contact form as a last resort method for reporting problems to them when the chatbot gets in over its head.

Most people hate this kind of LLM-provided customer support without any human escalation options, because the bots often end up uselessly looping through some debugging steps that simply do not work for the customer's specific issue for whatever reason, which feels like slamming your head against a wall repeatedly. It's a truly infuriating user experience and is practically guaranteed to destroy the business's public goodwill and reputation.

All of which means they are gutting their customer service department following some process that lacks access to these very basic insights, which screams mismanagement to me.

I'm not exactly a huge customer, but between my personal and business sites, I plowed $45k into CF last year, and I will not spend another penny on them this year, or ever again. Maybe that's not a huge spend in the grand scheme of the tech industry, but at a minimum that amount of money should entitle me to some human-provided support. My annual spend alone could provide the budget for multiple offshored CSRs. If I am spending enough money to buy a car, the least they can do is let me send them an email when I have a problem instead of just throwing me to the wolves.

Ultimately, they have a much weaker moat now than at any point in the past, because LLMs make it so much easier to build out critical functionality in-house that previously would have been worth paying someone else to manage via a SaaS. And while I may not be a big enough customer for them to worry about in and of myself, I am also not the only person affected by these business practices. Every affected person increases the reputational harms suffered by Cloudflare, with another alienated customer like me bashing CF in posts like this or in conversations with their friends and colleagues in the industry. Those harms should be very concerning to CF's management because it is extremely difficult to recover lost goodwill.