https://github.com/andreacrotti/ep2013
my talks and notes from Europython 2013
- Host: GitHub
- URL: https://github.com/andreacrotti/ep2013
- Owner: AndreaCrotti
- Created: 2013-06-22T16:29:11.000Z (over 12 years ago)
- Default Branch: master
- Last Pushed: 2013-10-22T10:43:00.000Z (about 12 years ago)
- Last Synced: 2025-02-01T08:31:05.239Z (12 months ago)
- Language: Python
- Size: 717 KB
- Stars: 3
- Watchers: 4
- Forks: 2
- Open Issues: 0
Metadata Files:
- Readme: README.org
README
Notes from other talks and various ideas collected
* TODO find out how to do the rst-class: build correctly with Hieroglyph
* TDD
** TODO add distinction between integration and unit tests and when mocking comes into consideration
** TODO add a way to set up the whole flow correctly in Emacs with the right setup hook
* TODO propose a lightning talk about Emacs showing the whole configuration
- *from zero to intellisense* with Jedi and other libraries.
- show how to install things on a clean Emacs environment
* Interesting links
- [[https://tahoe-lafs.org/trac/tahoe-lafs][tahoe-lafs]]
- python-rex, to use Perl-like regex in Python easily (using a cache)
- iktomi, something used to build web apps
- getsentry, use the hosted version
- mailing list: code-quality@python.org
* Sentry
- doing business with a completely open source project
- don't try to compete with people hosting it themselves
- try to minimize companies that are more a pain than anything else
- do the *project you love* and users will also love it
* Disqus
- phabricator to view the diffs
- review that
- CI done by Jenkins (before the change is committed)
- push with Git
- put all the commits in a daily deploy
- 0-8 daily deploys
django + celery + postgres + rabbit
** Problems with current stack
- high concurrency
- need to isolate things
- isolate features (to speed up the loop)
- fun and experimentation
(nginx modules for example, or Lua)
** Other interesting things
- use mixins (do only one thing)
Some libraries done by Disqus:
- django-mailviews
- nydus (for redis)
- gargoyle (for application switches)
SF jobs, seems interesting.
*** TODO suggest a possible way to force the hooks
* Static analysis
- tokenize module handles all the tokenization of things
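
A minimal sketch of what the standard library tokenize module produces for a line of source. The helper name =token_names= is made up for this example; only =tokenize.generate_tokens= and =tokenize.tok_name= come from the standard library.

```python
import io
import tokenize


def token_names(source):
    """Return (token type name, token string) pairs for a source string."""
    # generate_tokens expects a readline-style callable returning str lines.
    readline = io.StringIO(source).readline
    return [
        (tokenize.tok_name[tok.type], tok.string)
        for tok in tokenize.generate_tokens(readline)
    ]


pairs = token_names("x = 1 + 2\n")
for name, text in pairs:
    print(name, repr(text))
```

Each token also carries start and end positions, which is what static analysis tools typically build on.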
* Emacs show
Create a minimal working setup.
- auto completion with Jedi, and code navigation
- handling virtualenv
- git integration with magit
- running tests
- jumping to things easily
hosted solution (Sentry):
https://www.getsentry.com/pricing/
* Git internals
- git uses SHA1 hashes to keep track of everything, which is like a fingerprint
- use *git cat-file -p/-t* to show what is going on
- traditional SCMs store delta differences, which sounds more efficient, but
in git there is a copy of the whole tree for each commit, with commits linked together like a linked list.
- you can use "git checkout attach-head" (where attach-head is a branch name) to reattach HEAD after ending up in a detached HEAD state
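
To make the fingerprint idea concrete, here is a small sketch of how git derives the id of a blob (a file's content): it hashes a short header followed by the raw bytes. The function name =git_blob_sha1= is made up for this example.

```python
import hashlib


def git_blob_sha1(content: bytes) -> str:
    """Compute the object id git assigns to a blob with this content."""
    # Git prepends "blob <size in bytes>\0" to the content before hashing.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


print(git_blob_sha1(b"hello world\n"))
```

The result should match what *git hash-object* prints for a file with the same content, and the same id can then be passed to *git cat-file -p/-t*.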
* Next iteration of GUI applications
- generated by generator/iterator ideas, using PEP 342 and PEP 380
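
A rough sketch of the two language features involved, not the talk's actual GUI code: PEP 342 lets a generator receive values through =send()=, and PEP 380 lets it delegate to a sub-generator with =yield from=. The names =ask=, =dialog= and =run= are invented for this example, and =run= stands in for an event loop feeding user input.

```python
def ask(prompt):
    """Sub-coroutine: yield a prompt, then receive the answer via send()."""
    answer = yield prompt
    return answer


def dialog():
    """Top-level coroutine that delegates to sub-coroutines (PEP 380)."""
    name = yield from ask("name?")
    age = yield from ask("age?")
    return "{} is {}".format(name, age)


def run(coroutine, answers):
    """Drive the coroutine, feeding it one answer per prompt (PEP 342)."""
    prompts = []
    answers = iter(answers)
    try:
        prompts.append(next(coroutine))
        while True:
            prompts.append(coroutine.send(next(answers)))
    except StopIteration as stop:
        # The coroutine's return value travels on the StopIteration.
        return prompts, stop.value


prompts, result = run(dialog(), ["Ada", "36"])
print(prompts, result)
```

The dialog reads like sequential code even though control goes back to the driver at every prompt, which is the appeal for GUI flows.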
* Building to scale
** SQL
- Disqus runs on Postgres.
- You can use SQL for more or less everything.
- scaling is about predictability
- what can we do to improve SQL?
- Redis is the best simple technology that works
** Caches
*** Counters
- Using Redis as a cache for updating things
- Redis nodes are easily horizontally scalable
- SQL contention is resolved in the DB anyway
- Redis can be used instead of Celery for some of these things, for example
*** Queuing
Always use Celery and RabbitMQ, for example. If you're limited
in memory, Redis can be a problem, so better not to use it
for this particular task.
An example is an async task:
@task(queue="event creation")
def on_event_creation(event_id):
    ...
Keep every job that needs to be moved to a task small.
*** Object caching
Most of these things are denormalized: we don't want to use joins
and selects for things that must be fast.
You need caching when *your database can't handle the load
anymore*, and only if your data doesn't change too frequently.
Otherwise you can expect even worse performance, so it depends a lot on
the kind of data.
The right way to cache something is to always save the object in
the cache whenever we save and modify (PUSH-only, not PULL-PUSH).
There must be a plan on how things should scale.
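
The push-only idea from the object caching notes can be sketched as follows. This is a toy under stated assumptions: =ObjectStore= is an invented name, and both the database and the cache are plain dicts standing in for Postgres and a real cache.

```python
class ObjectStore:
    """Toy stand-in for a database plus a cache; both are plain dicts."""

    def __init__(self):
        self.database = {}
        self.cache = {}
        self.database_reads = 0

    def save(self, key, obj):
        """Write to the database and push the fresh object into the cache."""
        self.database[key] = obj
        self.cache[key] = obj

    def get(self, key):
        """Read from the cache; fall back to the database without caching."""
        if key in self.cache:
            return self.cache[key]
        self.database_reads += 1
        return self.database[key]


store = ObjectStore()
store.save("post:1", {"title": "draft"})
store.save("post:1", {"title": "final"})
print(store.get("post:1"), store.database_reads)
```

Because every write refreshes the cache, readers never see a stale object and never trigger a cache fill themselves, which is the difference from the pull-then-push pattern.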
*** Redis
*redis* must fit in memory, you can't split the data without
bringing it down.
Getting a list of keys for example blocks everything.
*SQL is good*, don't replace it!
* SPDY
Why?
- more graphics
- more speed
- more interactive applications (Angular / Ember)
We are still writing applications on top of HTTP/1.1 (from '99)
Django + SPDY
+ django
+ django-jython
+ ..
* Function annotation
- small metadata that can be added in the function definition
- PEP 3107 and Python 3.x only
def greet(name: str, age: int) -> str:
- rightarrow, nice library that defines a language to define types
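
A small example of what the annotations actually are at runtime: plain metadata stored on the function, with no checking done by Python itself. The body of =greet= is invented here, since the notes only show the signature.

```python
def greet(name: str, age: int) -> str:
    """Annotated function; the annotations are stored, not enforced."""
    return "Hello {}, you are {}".format(name, age)


# The metadata ends up in a plain dict on the function object.
annotations = greet.__annotations__
print(annotations)

# Nothing stops a caller from passing other types.
unchecked = greet(1, "x")
print(unchecked)
```

Libraries can read =__annotations__= to implement their own type checks or documentation.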
* Useful libraries
- pyenv
- gorun
* Metaclasses
Change the behaviour of *all* the possible objects! Every class
that subclasses object from now on gets a different
behaviour.
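import __builtin__  # Python 2 only; the module is called builtins in Python 3
class new_object(object):
    __metaclass__ = DebugMeta  # DebugMeta itself is not shown in these notes
__builtin__.object = new_object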
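
Since the notes do not show DebugMeta, here is a hypothetical version in Python 3 syntax (=metaclass== keyword instead of =__metaclass__=) that only records the classes it creates. It does not replace the builtin object; it just shows how the metaclass is inherited by every subclass.

```python
class DebugMeta(type):
    """Hypothetical metaclass that records the name of every class it builds."""

    created = []

    def __new__(mcs, name, bases, namespace):
        cls = super().__new__(mcs, name, bases, namespace)
        mcs.created.append(name)
        return cls


class Base(metaclass=DebugMeta):
    pass


class Child(Base):
    # No explicit metaclass: it is inherited from Base.
    pass


print(DebugMeta.created, type(Child).__name__)
```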
* Large scale applications
> 100k lines of code
- easy to write big applications
- faster to write
- no lock-in
- be ahead of the game and react to customer
Typical scenario:
+ complex interactions
+ complex work concepts
** Zen of application design
- only use objects for everything (and no primitives like dictionaries and tuples)
- different modules should not store any state!
- use a request.context object to carry the information around in the system
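
One possible reading of these rules, as a sketch: state lives in small immutable objects that are passed around explicitly, and module-level functions keep nothing between calls. =RequestContext=, =Invoice= and =can_view= are invented names, not taken from the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RequestContext:
    """Carries per-request information through the system."""
    user_id: int
    locale: str


@dataclass(frozen=True)
class Invoice:
    """Domain object used instead of a bare dictionary or tuple."""
    number: int
    owner_id: int


def can_view(context, invoice):
    """Stateless check: everything it needs arrives as arguments."""
    return context.user_id == invoice.owner_id


context = RequestContext(user_id=7, locale="it")
print(can_view(context, Invoice(number=1, owner_id=7)))
```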
* Elasticsearch
- a query returns what matches, plus a relevance score for each match
- use a filter unless you really need a query
"multi_match": ..."title^10", "body" (^10 boosts the title field to
make it more important)
Use a parent-child relationship to get results from ES faster.
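
A sketch of a request body combining the two ideas: a scored multi_match query with the boosted title field, and a filter for the part that only needs a yes/no answer. It uses the bool query layout of more recent Elasticsearch versions, which may differ from what the talk showed in 2013; the =status= field and the function name =build_search= are assumptions made for the example.

```python
import json


def build_search(text, status):
    """Build a search body: scored text match plus an unscored filter."""
    return {
        "query": {
            "bool": {
                # Scored part: title matches count ten times more than body.
                "must": {
                    "multi_match": {
                        "query": text,
                        "fields": ["title^10", "body"],
                    }
                },
                # Unscored part: filters only include or exclude documents.
                "filter": {"term": {"status": status}},
            }
        }
    }


body = build_search("python", "published")
print(json.dumps(body, indent=2))
```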
* TODO change the keyboard layout to italian
* Continuous integration
- use Travis, Bamboo or something similar
that sets you up well enough out of the box