I agree with all of your points about the article, but...
> However, they should take it a step further, and just avoid Django in the first place.
Django is a tool, and like most tools it has a use. I find Django indispensable for writing specific kinds of applications, and its admin interface is by far the best thing since sliced bread for internal/backoffice applications. It's not perfect by any means but it's amazing for quickly getting a CRUD app with an awesome administration interface up and running with minimal effort. Django REST framework, migrations and the amazing ecosystem (reversion, django-mptt, django-polymorphic, debug toolbar, django-currencies to name very few) are also what make Django appealing and rather awesome.
Making a blanket statement like 'Avoid Django' is rather silly, given all that.
> Also, am I the only one that has never needed to use "migrations" or any similar sorts of features?
I'm not sure why you put quotes around migrations as if it's some alien, obscure or weird feature. If you've never written a web application that needs migrations then you're not writing the kinds of applications (or indeed any 'serious' application) that would benefit from Django IMO.
> its admin interface is by far the best thing since sliced bread for internal/backoffice applications.
By the way, some best practices here that I've discovered:
- Use the Django admin for performing actions closely tied to specific models that don't involve any business logic. E.g. changing which one of a user's email addresses is their primary address.
- For admin business logic that spans multiple tables (e.g. inactivating a user's email address and logging that action in a different table), just create an app called admin_api or something similar and then create DRF endpoints for performing this sort of admin logic.
The benefit of having your admin business logic wrapped in REST endpoints is that you are writing and testing all your admin logic the same way as you write and test all your other endpoints. And since all your admin business logic is just another Django app, you can create models for your admin business logic, e.g. for logging the results of your database integrity checks. And then you can use the standard Django admin on top of those tables, so you're basically putting an admin on top of your admin.
And because all your admin logic is encapsulated in REST endpoints, you have the option of either hitting those endpoints from Postman or some custom admin front end, or else hitting the service methods that perform the business logic for those endpoints directly from the Django admin actions dropdown list[1].
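To make the "service method" idea concrete, here is a minimal, framework-free sketch. In a real project the endpoint would be a DRF view and the action would be registered on a ModelAdmin; everything named here (`deactivate_email`, `deactivate_email_endpoint`, `deactivate_emails_action`, and the in-memory `EMAILS` and `AUDIT_LOG` "tables") is a hypothetical stand-in, not code from any actual app.

```python
# Hypothetical in-memory stand-ins for two database tables.
EMAILS = {
    1: {"address": "a@example.com", "active": True},
    2: {"address": "b@example.com", "active": True},
}
AUDIT_LOG = []


def deactivate_email(email_id, actor):
    """Service method: the only place the business logic lives."""
    email = EMAILS[email_id]
    email["active"] = False
    AUDIT_LOG.append({"actor": actor, "action": "deactivate", "email_id": email_id})
    return email


def deactivate_email_endpoint(request):
    """Stand-in for a REST view: parse input, call the service, shape a response."""
    email = deactivate_email(request["email_id"], actor=request["user"])
    body = {"address": email["address"], "active": email["active"]}
    return {"status": 200, "body": body}


def deactivate_emails_action(selected_ids, admin_user):
    """Stand-in for an admin action: calls the service directly, no HTTP involved."""
    for email_id in selected_ids:
        deactivate_email(email_id, actor=admin_user)


# Both entry points end up in the same service method.
response = deactivate_email_endpoint({"email_id": 1, "user": "alice"})
deactivate_emails_action([2], admin_user="bob")
```

The point of the split is that tests can target `deactivate_email` once, while the endpoint and the admin action stay thin wrappers around it.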
> The benefit of having your admin business logic wrapped in REST endpoints is that you are writing and testing all your admin logic the same way as you write and test all your other endpoints.
You write all your business logic in REST endpoints? That's insanity... Are you even doing RESTful things with all those endpoints?
Have you even considered the impact of HTTP overhead? This is a thread about scalability, after all.
Don't overcomplicate shit. This article isn't even about performance, it's about complexity it seems, but you're here promoting a HORRIBLE idea as a "best practice".
Microservices are one thing, but replacing your data access layer in INTERNAL code with a RESTful endpoint is kind of insane, and will only lead to problems later.
For example, very recently, I had to audit an app that was very very slow. They had recursive data calls that took 1000x longer because the idiot that slung them together used an inline CURL call via their main production API endpoint.
That one request then led to 1000s of other requests, which overwhelmed the load balancer because every one of those API calls triggered more in-kind API calls to fetch other data. And because each request went back out through the load balancer, it landed on another server which did not have the needed data in memory, so that server made a similar API call to fetch it, which then went to yet another server behind the load balancer. It just devolved into a clusterfuck cacophony of bullshit and massive overhead and slowness where a simple Foo.get(id=blah) would have sufficed in the first place.
Their developers' proposed solution? "hit localhost instead of the load balancer". Guess what, it was still very very slow, because of HTTP overhead. They finally listened, and killed that CURL request, and replaced it with a recursive call back to self, and suddenly IO dropped, requests were responsive, and they were able to remove half of their servers from the load balancer pool.
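A toy sketch of the difference described above, with no real networking. `CHILDREN` is a made-up parent/child relation and `fake_http_get` is a hypothetical stand-in that only counts round trips; the audited app's actual code is not shown in this thread.

```python
# Hypothetical parent -> children relation standing in for the recursive data.
CHILDREN = {
    "root": ["a", "b"],
    "a": ["a1", "a2"],
    "b": ["b1"],
    "a1": [],
    "a2": [],
    "b1": [],
}

http_round_trips = 0


def fetch_subtree_in_process(node_id):
    """Plain recursive call: no serialization, no sockets, no load balancer."""
    children = [fetch_subtree_in_process(child) for child in CHILDREN[node_id]]
    return {"id": node_id, "children": children}


def fake_http_get(node_id):
    """Stand-in for an HTTP call back into the same API; only counts round trips."""
    global http_round_trips
    http_round_trips += 1
    return fetch_subtree_via_api(node_id)


def fetch_subtree_via_api(node_id):
    """Same logic, but every child costs one simulated HTTP request."""
    children = [fake_http_get(child) for child in CHILDREN[node_id]]
    return {"id": node_id, "children": children}


direct = fetch_subtree_in_process("root")
via_api = fetch_subtree_via_api("root")
```

Both versions return the same tree; the second one just pays one request per descendant node, and in the real case each of those also went through the load balancer.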
> replacing your data access layer in INTERNAL code with a RESTful endpoint is kind of insane
Nothing is being replaced. Each view has a service method that performs the business logic, so you can either call the endpoint or else call the service method. There is zero performance implication, and basically zero extra complexity.
I don't see exactly what the mentioned problem has to do with writing REST endpoints though.
Agree with OP. REST endpoints are in many cases the way to go. Particularly today, when the front end might undergo who knows what transition and not really have much to do with Django at all.
> I'm not sure why you put quotes around migrations as if it's some alien, obscure or weird feature. If you've never written a web application that needs migrations then you're not writing the kinds of applications (or indeed any 'serious' application) that would benefit from Django IMO.
I have been programming for almost 15 years, and have never once needed to use migrations. I have worked on projects for multiple years with multiple developers and both small and large teams, and grew and scaled them, massively overhauled schemas, etc... and still, have never needed migrations.
Have I ever had to add a column to a database? Absolutely. Have I ever needed a massively overcomplicated "migration" tool? Hell no, because I put more than 5 minutes of thought into my application logic and data structures before I even started writing code or designing the database schema.
I've single-handedly built, and maintained with a team, applications that did millions of dollars of revenue per day, with 10s of thousands of users per minute normally, with 100s of thousands per minute during peak load and used various styles of SQL data stores with between 20 and 50 tables on some projects (sounds pretty serious to me)... and still never once had a need for migrations.
My point is, if you need to lean on migrations even remotely often, you're doing something very very wrong.
I think, though, the type of developer using Django for its large pool of third-party extensions isn't really the type of developer who puts a lot of thought into what they are doing. Maybe I'll catch flak for this, but it's pretty true in my experience. It's the same as some front-end JS devs who sling various frameworks and libraries together, and end up with a pile of unmaintainable mess in the end...
Django is used to sling backend web apps together, fast, and needing migrations is evidence of that.
However, if one is doing proper clean development and following some simple best practices, an automated and complex migration should never be needed in the first place.
You get your DB model 100% correct the first time, always? Even years later your DB model is able to accommodate all those always changing business requirements?
> Just out of curiosity, have you had to maintain/modify any of these apps over these years and years?
Absolutely.
> If so, without any schema changes?
Usually, to be honest, yeah... Core, basal units of any given business process usually don't change. Only how they are abstracted, and that abstraction lives in code, not the database.
With a proper storage architecture it is very very very rare to need to change old data structures. It most definitely doesn't happen often enough to necessitate the use, overhead, and complexity of a baked-in migrations library.
If you think "adding a table" or "adding a field" is a "schema change" then that is where the problem lies... I think. Extending a schema should not require a migration library. CHANGING a schema, as in, destroying old data, and making new data, could surely use the help of a migration library, but, generally, if you're doing that often, you're doing something very wrong, and if you're not doing that often, then you definitely don't need a baked in migration library.
Ultimately, if things are so bad that you need a migrations library, you're better off fixing that shit at a low level, and starting over from scratch if necessary. Sometimes, instead of creating years and years of tech debt and burning resources on working with a clusterfuck, you just need to take what you can from that clusterfuck, and do it right. Then you only have a single migration, to go from old clusterfuck data to new extensible data structure.
Chances are, if you have old fucked up horrific data, it stays that way, and then some person or team comes along and writes a nice clean API on top of it. And then people have to maintain that horrific thing, and it's only horrific because it's interfacing with shit data layers in the first place.
Don't just abstract bad data architecture away into a service layer, is my point... Do it right.
Extensible is the key word here, but with an emphasis on decoupling.
If you have some dynamic data structure that is truly changing, structurally, often, and isn't simply being appended to or extended, then it probably should not be stored in a traditional SQL database in any way that necessitates schema changes. If you can't figure out how to represent data in an effective, decoupled, and extensible manner, then you're screwed from the get-go. A migrations library won't help you.
I'd like to know what I am doing wrong, as I use migrations all the time. The options are to change the database later (and use migrations) or to write a whole ton of crappy code to avoid using them. I have tried both ways and usually migrations work better.
I always thought a data model should be as agnostic to the system architecture as possible. So I don't understand how a system architecture can help to avoid data model migrations.
With the knowledge I have now it would be easy for me to say "and rightly so", but I think I can understand where you're coming from.
Django didn't always have schema migrations built in. I think they appeared in version 1.7 about 2 and a bit years ago. Before then there were separate add-on apps that could do migrations for you.
When Django added built-in support for migrations I still didn't use them for a while because at the time I was primarily a solo developer and wanted to have full control over what happened in the DB. And because I was a solo developer it didn't matter. So in that sense your comment is in some ways correct.
However, in recent times I've started to work as part of a team. And that's where migrations really start to shine, because there might be several different people making app changes that result in DB schema changes. Django stores migration files with information about which other migrations each one is dependent upon.
Without that kind of migrations system it would simply be impossible for developers working on combinations of different branches of a development tree to build a working version of the code.
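As a rough illustration of why recording dependencies helps when branches are merged: once each migration names the ones it depends on, a valid apply order can be computed rather than agreed on by email. This is a simplified sketch using the standard library's `graphlib`, not Django's actual loader, and the migration names are made up.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical migrations: each name maps to the migrations it depends on.
dependencies = {
    "users.0001_initial": set(),
    "orders.0001_initial": {"users.0001_initial"},
    "users.0002_add_phone": {"users.0001_initial"},      # written on branch A
    "orders.0002_add_coupon": {"orders.0001_initial"},   # written on branch B
    "orders.0003_link_phone": {"users.0002_add_phone", "orders.0002_add_coupon"},
}

# static_order() yields every migration only after all of its dependencies.
apply_order = list(TopologicalSorter(dependencies).static_order())

for name in apply_order:
    print(name)
```

The exact order of independent migrations (the two written on separate branches) is not significant; what matters is that nothing is applied before the things it depends on.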
I am fully capable of doing everything done with migrations with SQL, but I always use migrations (even for custom SQL outside of the ORM's capability). Why? Because then we have a record of the schema state at any given moment (that is in sync with the application code at that point) and how it got to where it is now.
If you're leaning on migrations often, with SQL, then I think you should not be using SQL, obviously (or you're just doing something wrong). Or one should use a SQL database that allows more abstract data types (like JSON blobs in Postgres).
I prefer to make use of the strong type system and inbuilt integrity checks that a database gives. JSON blobs mean reimplementing this at app level. The people paying me are paying me to build their products, not reinvent the wheel.
Migrations are an excellent tool; if you aren't using them, you are doing something wrong.+
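A small sketch of that trade-off, using SQLite from the standard library as a stand-in for whatever database you actually run (the thread mentions Postgres). The table and column names are hypothetical.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE payments_typed (
        id INTEGER PRIMARY KEY,
        amount_cents INTEGER NOT NULL CHECK (amount_cents >= 0),
        currency TEXT NOT NULL CHECK (length(currency) = 3)
    )
    """
)
conn.execute("CREATE TABLE payments_blob (id INTEGER PRIMARY KEY, data TEXT NOT NULL)")

bad_payment = {"amount_cents": -500, "currency": "dollars"}

# The blob table accepts anything that serializes; validation is now the app's job.
conn.execute("INSERT INTO payments_blob (data) VALUES (?)", (json.dumps(bad_payment),))

# The typed table refuses the same data at the database level.
try:
    conn.execute(
        "INSERT INTO payments_typed (amount_cents, currency) VALUES (?, ?)",
        (bad_payment["amount_cents"], bad_payment["currency"]),
    )
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

blob_rows = conn.execute("SELECT COUNT(*) FROM payments_blob").fetchone()[0]
typed_rows = conn.execute("SELECT COUNT(*) FROM payments_typed").fetchone()[0]
```

The bad row lands in the blob table without complaint, while the CHECK constraints keep it out of the typed one.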
Edit
+ assuming that you are working with a relational database and have an agileish development process or ever have requirements changes.
And how do you keep your database schema under version control?
Edit: Do all you downvoters not keep track of your schema changes as you develop your application (via migrations, or even sql files), so you can easily roll back changes if needed?
The models live in version control. It's that simple.
If your data structure is changing so often in such a way that you need to keep a history, then I'm afraid you have some fundamental problems that no amount of tooling or process can solve. You will not be able to scale your development efforts with this type of constantly changing data model, even with a migration library.
If these are truly problems that you or your team has, then you should hire someone to help you organize your application, data, and workflow in a more effective manner.
> I have been programming for almost 15 years, and have never once needed to use migrations.
> Have I ever had to add a column to a database? Absolutely,
... then you've used a migration. A statement like "I've never once had to use a migration" is so obviously incorrect that it's comical. Your contradiction two lines down makes it more so.
The rest of your comment is fairly misinformed about a few things, full of FUD, misses the point of Django migrations entirely and seems desperate to paint any use of migrations as the result of being a "bad programmer" rather than changing requirements or a natural part of development. Seeing as you've used migrations, you're a bad programmer as well.
As such I'm not going to bother writing a reply to the rest of it like I usually would.
Please keep this kind of programming flamewar off HN. If someone is wrong and you want to explain how, do so civilly.
We've unfortunately had to warn you a bunch of times already about being uncivil on HN. This process isn't infinite and ends in an account getting banned, so would you please fix this?
Please don't get involved in technical flamewars on HN (or other flamewars). As agitation goes up, information goes down, and these discussions turn into back-and-forth spats that benefit no one, including the combatants.
Okay, so you changed the DB schema in your 'dev' environment to add a column.
1. In what manner/format do you define or commit the change to VCS so it is in sync with the corresponding code change (that uses the new column)?
2. What happens when a teammate comes to deploy that schema change to production?
3. What happens when we need to roll back production to some previous commit? The need to roll back may or may not relate to the schema change (not suggesting you 'got it wrong').
If you can answer 1, 2, 3 by some convention ... you've created a migration system that should probably be codified.
1. Change model code to add the new field, test, and commit it to version control. Then it's merged to staging, where it's tested in a production-like environment with a recent production data snapshot. After that's approved, it goes to master/production servers.
2. git pull then restart some wsgi/worker processes
3. git checkout to desired revision and restart some wsgi/worker processes
It's perfectly fine to have models comprised of only a subset of the data they interact with.
The models do not have to reflect all fields that exist in the database.
The models only reflect information that is necessary for the application in which said models are used.
Basically, if you clobber your database, no amount of migration tooling or process will help you. Ever. It will just get in the way, to be honest, and will prevent you from doing things properly in the first place.
You missed the point of all those questions (e.g. in 2 you don't mention how the schema change gets applied to the prod database). Unfortunately I'm frankly out of patience explaining any further, so we'll have to leave it there.
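A small sketch of the gap being argued about here, using SQLite from the standard library. The `users` table and `phone` column are hypothetical. Code that ignores a column is unaffected by it, but code that expects a column still needs someone to have run the schema change against that particular database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL)")
conn.execute("INSERT INTO users (email) VALUES ('a@example.com')")


def old_code(db):
    # Reads only the columns it knows about, so extra columns are harmless.
    return db.execute("SELECT id, email FROM users").fetchall()


def new_code(db):
    # Written after the (hypothetical) phone column was added to the model.
    return db.execute("SELECT id, email, phone FROM users").fetchall()


# Deploying the new code without touching the database: the column is missing.
try:
    new_code(conn)
    failed_before_alter = False
except sqlite3.OperationalError:
    failed_before_alter = True

# Someone, or something, still has to run this against production.
conn.execute("ALTER TABLE users ADD COLUMN phone TEXT")

rows_new = new_code(conn)
rows_old = old_code(conn)
```

Pulling the new code and restarting workers covers the first half of that; the `ALTER TABLE` step is the part a migration system records and replays.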
Again, FUD. If you really need me to spell out why then I will, but I doubt it will change your viewpoint.
Let me spell it out for you:
You build your awesome [app] for [client]. You get your schema bang on first time and it's all working absolutely fine. Everyone is happy.
Then [client] comes to you and says "[app] is awesome, but we need [x]". You think, and [x] needs some database modifications. Maybe it's a new column to store something, or perhaps it's a column that now shouldn't be nullable. Maybe it's a new table entirely. Perhaps what you need is already implemented in library [y], which needs database modifications of some kind like its own tables. The possibilities are endless.
Great. So how do we do this? Obviously we need to run some SQL on the database. So you change all your code, press deploy, and really quickly log onto the production server and smash out some artisanal SQL to modify the database before any requests hit the server. Everyone is happy! You quickly write an email to all your co-workers telling them to do these migrations locally so they can run the app.
Oh no, but wait, you forgot to do operation Z! Now your app and everyone else's is broken. You resolve to ensure this doesn't happen again by writing a .sql file for each migration and running them in sequence. Great, you can even check them into your VCS and share them with your co-workers. Awesome! You can even make them part of your CI build (you do that, right?).
You do another deploy, but nuts, something went wrong! You need to roll a migration back! But you only added forward migrations in your .sql files! Ok, so let's add [migration]-backwards.sql and [migration]-forwards.sql from now on. Awesome! So you've got forwards and backwards migrations. But... wait, we need to run some Python code to do something as part of the migration. Ok... let's add a [migration]-{forwards}.py as well.
Fantastic. But then you start to install package [z] because it does exactly what the client wants and you realize the benefits of using well tested, widely used libraries in your systems and not writing everything by hand. This requires some database modifications. The author of this package has the migrations it needs but they're in its own home-grown format and only for MySQL and Oracle! Nuts, ok, so you translate them by hand and add them to your migrations. But you made a mistake somewhere in the translation and everything breaks. You fix it by hand.
Congratulations. You've just written a shit version of Django migrations that everyone hates and that doesn't really work.
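For the curious, the hand-rolled system from the story above comes out roughly like this. It is a deliberately minimal sketch on top of SQLite: the migration names, the `applied_migrations` bookkeeping table and the `migrate`/`rollback` helpers are all hypothetical, and it has none of the dependency tracking, multi-database support or third-party package handling the story goes on to need.

```python
import sqlite3

# Hypothetical migrations: (name, forwards SQL, backwards SQL), in apply order.
MIGRATIONS = [
    (
        "0001_create_users",
        "CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL)",
        "DROP TABLE users",
    ),
    (
        "0002_create_audit_log",
        "CREATE TABLE audit_log (id INTEGER PRIMARY KEY, message TEXT NOT NULL)",
        "DROP TABLE audit_log",
    ),
]


def applied(db):
    """Return the names of migrations already recorded in this database."""
    db.execute("CREATE TABLE IF NOT EXISTS applied_migrations (name TEXT PRIMARY KEY)")
    return {row[0] for row in db.execute("SELECT name FROM applied_migrations")}


def migrate(db):
    """Apply every migration that has not been recorded yet, in order."""
    done = applied(db)
    for name, forwards, _ in MIGRATIONS:
        if name not in done:
            db.execute(forwards)
            db.execute("INSERT INTO applied_migrations (name) VALUES (?)", (name,))
    db.commit()


def rollback(db, target):
    """Undo migrations in reverse order until `target` is the latest applied one."""
    done = applied(db)
    for name, _, backwards in reversed(MIGRATIONS):
        if name == target:
            break
        if name in done:
            db.execute(backwards)
            db.execute("DELETE FROM applied_migrations WHERE name = ?", (name,))
    db.commit()


def tables(db):
    rows = db.execute("SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
migrate(conn)
tables_after_migrate = tables(conn)
rollback(conn, "0001_create_users")
tables_after_rollback = tables(conn)
```

Even this toy version already needs a bookkeeping table, an ordering convention and paired forwards/backwards statements, which is the ground a built-in migration framework covers for you.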
tl;dr: schema changes are a natural part of development. You, me, and everyone else working on any kind of development project has done them. They happen, this is a fact. If you need to:
1. Share the migrations in your team
2. Run them as part of your CI build
3. Use third party migrations for packages (including built in contenttypes or other core Django tables)
4. Have them in a format that works for any supported database
5. Have rollback and the ability to execute arbitrary code as part of the migration
Then you can either write your own shitty version or use a well supported, built in system that handles the complexity for you. This is not a bad thing.
I can't comprehend your viewpoint, which mostly boils down to "these damn kids are on my lawn", so I can only imagine your setup is some ad-hoc .sql files that you run by hand.
You are setting up a comical hypothetical situation in which the developer is blundering through their job in a mindless, inept manner.
I'll just stop you right there.
No amount of best practices, process, or tooling will fix an incompetent developer who is part of a disorganized team that are working on a pile of bad architecture.