{"id":16778299,"url":"https://github.com/deeplook/pydata_berlin2016_materials","last_synced_at":"2025-07-23T22:38:36.858Z","repository":{"id":66092387,"uuid":"59344929","full_name":"deeplook/pydata_berlin2016_materials","owner":"deeplook","description":"Collection of pointers to slides and repositories from speakers at PyData Berlin 2016","archived":false,"fork":false,"pushed_at":"2016-06-30T09:34:38.000Z","size":24,"stargazers_count":37,"open_issues_count":0,"forks_count":32,"subscribers_count":5,"default_branch":"master","last_synced_at":"2025-03-16T19:26:15.142Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":null,"has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/deeplook.png","metadata":{"files":{"readme":"README.rst","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2016-05-21T06:12:54.000Z","updated_at":"2025-01-04T16:42:39.000Z","dependencies_parsed_at":null,"dependency_job_id":"944149d9-661c-44ca-a550-af9f8a2229db","html_url":"https://github.com/deeplook/pydata_berlin2016_materials","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/deeplook/pydata_berlin2016_materials","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/deeplook%2Fpydata_berlin2016_materials","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/deeplook%2Fpydata_berlin2016_materials/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/deeplook%2Fpydata_berlin2016_materials/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repo
sitories/deeplook%2Fpydata_berlin2016_materials/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/deeplook","download_url":"https://codeload.github.com/deeplook/pydata_berlin2016_materials/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/deeplook%2Fpydata_berlin2016_materials/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":266761449,"owners_count":23980298,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-07-23T02:00:09.312Z","response_time":66,"last_error":null,"robots_txt_status":null,"robots_txt_updated_at":null,"robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-10-13T07:27:22.115Z","updated_at":"2025-07-23T22:38:36.833Z","avatar_url":"https://github.com/deeplook.png","language":null,"funding_links":[],"categories":[],"sub_categories":[],"readme":"PyData Berlin 2016 Materials\n============================\n\n\nKeynotes\n--------\n\nOlivier Grisel, Predictive Modelling with Python\n\n- http://ogrisel.github.io/decks/2016_pydata_berlin/\n- https://github.com/ogrisel/docker-distributed\n\n\nJulia Evans, How to trick a neural network\n\n- http://jvns.ca/blog/2016/05/21/a-few-notes-from-my-pydata-berlin-keynote/\n\n\nWes McKinney, Python Data Ecosystem: Thoughts on Building for the Future\n\n- http://de.slideshare.net/wesm/python-data-ecosystem-thoughts-on-building-for-the-future\n\n\nRegular\n-------\n\nDaniel Kirsch, Functional Programming in Python\n\n- 
https://github.com/kirel/functional-python\n\n\nTrent McConaghy, BigchainDB: a Scalable Blockchain Database, in Python\n\n- https://github.com/bigchaindb/bigchaindb\n\n\nDavid Higgins, Introduction to Julia for Python programmers\n\n- https://github.com/daveh19/pydataberlin2016\n\n\nKatharina Rasch, What every Data Scientist should know about data anonymization\n\n- https://github.com/krasch/presentations/blob/master/pydata_Berlin_2016.pdf\n\n\nAlexander Sibiryakov, Frontera: open source, large scale web crawling framework\n\n- https://github.com/scrapinghub/frontera\n\n\nThomas Reineking, Plumbing in Python: Pipelines for Data Science Applications\n\n- Yamal: Not yet Opensourced\n\n\nRyan Henderson, image-match: a python library for searching for similar images in large corpora\n\n- https://github.com/ascribe/image-match\n\n\nIan Ozsvald, Statistically Solving Sneezes and Sniffles (a work in progress)\n\n- https://speakerdeck.com/ianozsvald/statistically-solving-sniffles-step-by-step-a-work-in-progress\n- http://ianozsvald.com/2016/05/07/statistically-solving-sneezes-and-sniffles-a-work-in-progress-report-at-pydatalondon-2016/\n\n\nFelix Biessmann, Predicting Political Views From Text\n\n- https://github.com/felixbiessmann/\n\n\nJie Bao, ExpAn - A Python Library for A/B Testing Analysis\n\n- https://github.com/zalando/expan\n- http://www.slideshare.net/JieBao3/expan-presentation-pydata-berlin-2016\n\n\nAnne Matthies, Zero-Administration Data Pipelines using AWS Simple Workflow\n\n- https://github.com/babbel/floto\n\n\nDaniel Moisset, Bridging the gap: from Data Science to service\n\n- https://github.com/machinalis/slides/tree/master/data-science-to-service\n\n\nKatharine Jarmul, Holy D@t*! 
How to Deal with Imperfect, Unclean Datasets\n\n- https://docs.google.com/presentation/d/1G-lgHKTdrqeeJhcvVmd7C9gOIfTRe429zhBN6lmKKzA/\n\n\nNora Neumann, Usable A/B testing – A Bayesian approach\n\n- https://speakerdeck.com/nneu/b-testing-a-bayesian-approach\n\n\nFrank Kaufer, Building Polyglot Data Science Platform on Big Data Systems\n\n- https://speakerdeck.com/fkaufer/polyglot-data-science-platforms-on-big-data-systems\n\n\nLukasz Czarnecki, Brand recognition in real-life photos using deep learning\n\n- http://de.slideshare.net/ukaszCzarnecki/brand-recognition-in-reallife-photos-using-deep-learning-lukasz-czarnecki-pydata-berlin-2016/\n\n\nEdouard Fouché, Accelerating Python Analytics by In-Database Processing\n\n- https://ibmdbanalytics.github.io/pydata-berlin-2016-ibmdbpy.slides.html\n\n\nDelia Rusu, Estimating stock price correlations using Wikipedia\n\n- https://speakerdeck.com/deliarusu/estimating-stock-price-correlations-using-wikipedia\n- https://github.com/deliarusu/wikipedia-correlation\n\n\nJakob van Santen, The IceCube data pipeline from the South Pole to publication\n\n- http://icecube.wisc.edu/~jvansanten/pasties/slides/2016-05-21%20PyData.pdf\n\n\nMoritz Neeb, Bayesian Optimization and its application to Neural Networks\n\n- https://slack-files.com/T18U1ASNQ-F1AHX36HG-22a535f1a2\n\n\nKashif Rasul, What's new in Deep Learning?\n\n- https://bitly.com/new-deep-learning\n- https://bitly.com/cifar10-resnet\n\n\nNathan Epstein, Machine Learning at Scale\n\n- https://github.com/NathanEpstein/pydata-berlin\n\n\nRonert Obst and Dat Tran, PySpark in Practice\n\n- http://pydata2016.cfapps.io/#/\n\n\nJose Quesada, A full Machine learning pipeline in Scikit-learn vs in scala-Spark: pros and cons\n\n- https://files3.mixmaxusercontent.com/Qb5xzixaAsFjdsNbn/f/TZIGovHLNm7Is9Z5P/?messageId=lC47ZAHVG9riuwJkc\u0026rn=Iibh1mclh2RgUnbpRkI\u0026re=ISZk5ibpxmclJWLulmLul2dyFGZA5WYtJXZodmI\n\n\nMartina Pugliese, Spotting trends and tailoring recommendations: PySpark on 
Big Data in fashion\n\n- https://github.com/martinapugliese/talks_presentations/tree/master/pydata_berlin_2016\n\nAngelos Kapsimanis, The Simple Leads To The Spectacular (Cancelled)\n\nAnton Dubrau, Using small data in the client instead of big data in the cloud\n\n- did not respond, yet\n\nNils Magnus, Dealing with TBytes of Data in Realtime\n\n- did not respond, yet\n\nAbhishek Thakur, Classifying Search Queries without User Click Data\n\n- did not respond, yet\n\nJessica Palmer, Python and TouchDesigner for Interactive Experiments\n\n- did not respond, yet\n\nMaciej Gryka, Removing Soft Shadows with Hard Data\n\n- did not respond, yet\n\nAndreas Lattner, Setting up predictive analytics services with Palladium\n\n- did not respond, yet\n\nAndrej Warkentin, Visualizing FragDenStaat.de\n\n- did not respond, yet\n\nJames Powell, The kwarg problem\n\n- did not respond, yet\n\nMatthew Honnibal, Designing spaCy: A high-performance natural language processing (NLP) library written in Cython\n\n- did not respond, yet\n\nValentine Gogichashvili, Data Integration in the World of Microservices\n\n- did not respond, yet\n\nMichelle Tran, Chain, Loop \u0026 Group: How Celery Empowered our Data Scientists to Take Control of our Data Pipeline\n\n- did not respond, yet\n\nGuertel Idai, Artificial Body Representation in Robots, Expectation and Surprise\n\n- did not respond, yet\n\nRobert Meyer, pypet: A Python Toolkit for Simulations and Numerical Experiments\n\n- did not respond, yet\n\nJuha Suomalainen, Visualizing research data: Challenges of combining different datasources\n\n- did not respond, yet\n\nDanny Bickson, Python based predictive analytics with GraphLab Create\n\n- did not respond, yet\n\nFang Xu, Connecting Keywords to Knowledge Base Using Search Keywords and Wikidata\n\n- did not respond, yet\n\nDr. 
Markus Abel, Python Learns to Control Complex Systems\n\n- did not respond, yet\n\n\nTutorials\n---------\n\nFrank Gerhardt, Using Spark - with PySpark\n\n- https://gitlab.com/gerhardt.io/pyspark-workshop\n\nMike Müller, Single-source Python 2/3\n\n- http://www.python-academy.com/download/pydatabln2016/Single_Source_Python_2_3.pdf\n\nKatharine Jarmul, Data Wrangling with Python\n\n- https://github.com/kjam/data-wrangling-pycon\n\nLev Konstantinovskiy, Practical Word2vec in Gensim\n\n- https://github.com/RaRe-Technologies/movie-plots-by-genre\n\nShoaib Burq, Which city is the cultural capital of Europe? An introduction to Apache PySpark for GeoAnalytics\n\n\nLightning Talks\n---------------\n\nOliver Zeigermann\n\n- https://djcordhose.github.io/big-data-visualization/2016_pydata_berlin_lightning.html#/\n\n\nPiotr Migdał, Teaching machine learning\n\n- https://speakerdeck.com/pmigdal/teaching-machine-learning\n- http://p.migdal.pl/2016/03/15/data-science-intro-for-math-phys-background.html\n\nMentioned tools:\n\n- Pybuilder: Tired of writing setup.py? http://pybuilder.github.io/\n- Sputnik: Package manager for Data https://github.com/spacy-io/sputnik\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdeeplook%2Fpydata_berlin2016_materials","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdeeplook%2Fpydata_berlin2016_materials","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdeeplook%2Fpydata_berlin2016_materials/lists"}