An open API service indexing awesome lists of open source software.

https://github.com/andreacrotti/postgres-migration


https://github.com/andreacrotti/postgres-migration

Last synced: 5 days ago
JSON representation

Awesome Lists containing this project

README

          

#+AUTHOR: Andrea Crotti @andreacrotti
#+TITLE: Django: From MySQL to Postgres
#+OPTIONS: num:nil ^:nil toc:nil timestamp:nil reveal_single_file:t
#+REVEAL_TRANS: fade
#+REVEAL_SPEED: fast
#+REVEAL_PLUGINS: notes
#+EMAIL: andrea.crotti@iwoca.co.uk

* Why

[[./images/postgresql_versus_mysql.jpg]]

* Why, really

This sucks:

#+BEGIN_SRC sql
SELECT X.* FROM no_chain_samplemodel as X
JOIN (SELECT user_id, MAX(timestamp) AS timestamp
FROM no_chain_samplemodel
GROUP BY user_id) AS Y
ON (X.user_id = Y.user_id and X.timestamp = Y.timestamp)i
WHERE X.staff_id = %s

#+END_SRC

This is great:

#+BEGIN_SRC sql

SELECT DISTINCT ON (user_id) FROM no_chain_samplemodel
WHERE timestamp <= '2017-01-01' ORDER BY user_id ASC, timestamp DESC;

#+END_SRC

With the Django ORM:

#+BEGIN_SRC python

SampleModel.objects.distinct('user_id').\
filter(timestamp__gt=mydate).order_by('user_id', '-timestamp')

#+END_SRC

* How

*From this*

#+BEGIN_SRC python
DATABASES = {
'default': {
'ENGINE': 'django.db.backends.mysql',
...
}
}
#+END_SRC

*To this*

#+BEGIN_SRC python
DATABASES = {
'default': {
'ENGINE': 'django.db.backends.postgresql_psycopg2',
...
}
}
#+END_SRC

* Done?

[[./images/done_yet.png]]

* A few numbers

Massive fintech Django project:

- 190k lines of Python code
- > 100 Apps
- 383 tables to migrate
- 3000 tests running in ~3 minutes

* What's the plan

- adapt code
- migrate data
- profit!

* Data migration

** Pgloader

*Pgloader* to the rescue:

[[./images/pgloader.png]]

*No live replication* == *Downtime!!!*

** Very big tables

- drop foreign keys (ForeignKey → IntegerField)
- adapt queries
- write a database router

* Code changes

- get Postgres on CI (stable tests)
- search for untested raw queries
- manual testing on real data

** Regression testing

[[./images/notebook.png]]

* Tips for switchers

- *use* migrations for everything
- *test* everything
- *NEVER* rely on implicit ordering
- make Django apps really independent
- split that monolith ASAP

* Conclusions

#+BEGIN_QUOTE

Hofstadter's Database Migration Law:

Migrating from MySQL to Postgres always takes longer than you expect, even when you take into account Hofstadter's Law.

#+END_QUOTE

@andreacrotti https://www.iwoca.co.uk/